Data Science Bootcamp Capstone Project: Predicting Open Rates and Click-to-Open Rates for Newsletters using Natural Language Processing

  • Open rate (OR): the number of opened emails divided by the number of emails delivered
  • Click-to-open rate (CTOR): the share of people who open an email and also click a link in the newsletter, calculated as total unique clicks divided by total unique opens. It is a strong indicator of the effectiveness of the campaign
  • Click rate (CR): the total number of clicks divided by the number of emails delivered
  • Predict high OR and CTOR
  • Identify keywords and therefore user segmentation & behaviour
  • Provide insights to the business to aid in reaching their email marketing goals
  • Rates changed to numeric
  • New feature for OR created
  • Date column reformatted into datetime and day, month, and year extracted
  • Outliers removed
  • LogisticRegression
  • DecisionTreeClassifier
  • RandomForestClassifier
  • KNN Classifier
  • Naive Bayes Bernoulli
  • Naive Bayes Multinomial
  • resumo
  • retrospectiva
  • slipknot
  • nervosa
  • mustaine
  • lennon
  • This project allowed me to research and construct WordClouds
  • It also allowed me to refine some visualisation skills, specifically when visualising the rates with benchmarks and comparing the model scoring
  • Using t-tests to check outliers, user segmentation, and model robustness was a key learning opportunity for me
  • My personal challenges are mentioned under limitations as well: a language barrier, and a limited understanding of email marketing strategy
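
The rate definitions and the date-handling step listed above can be sketched in pandas roughly as follows. This is a minimal illustration, not the project's actual code: the column names (`delivered`, `unique_opens`, `unique_clicks`, `total_clicks`, `date`), the sample values, and the helper `add_rates` are all assumptions made for the example.

```python
import pandas as pd

# Hypothetical campaign data; column names and values are illustrative only.
df = pd.DataFrame({
    "date": ["2021-01-05", "2021-02-10", "2021-03-15"],
    "delivered": [1000, 2000, 1500],
    "unique_opens": [250, 400, 450],
    "unique_clicks": [50, 40, 90],
    "total_clicks": [60, 50, 120],
})


def add_rates(frame: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with OR, CTOR, CR and date parts added (hypothetical helper)."""
    out = frame.copy()

    # Open rate: opened emails out of total delivered.
    out["open_rate"] = out["unique_opens"] / out["delivered"]

    # Click-to-open rate: total unique clicks divided by total unique opens.
    out["ctor"] = out["unique_clicks"] / out["unique_opens"]

    # Click rate: total clicks divided by emails delivered.
    out["click_rate"] = out["total_clicks"] / out["delivered"]

    # Reformat the date column into datetime and extract day, month, and year.
    out["date"] = pd.to_datetime(out["date"])
    out["day"] = out["date"].dt.day
    out["month"] = out["date"].dt.month
    out["year"] = out["date"].dt.year
    return out


rates = add_rates(df)
print(rates[["open_rate", "ctor", "click_rate", "day", "month", "year"]])
```

Keeping the three rates as plain numeric columns makes it straightforward to compare them against benchmarks or to feed them to any of the classifiers listed earlier.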
