7 분 소요

[Notice] [Marketing_Mix_Modeling]

Marketing mix modeling through sales data analysis

Data

Data is (https://www.kaggle.com/sazid28/advertising.csv/).


  • TV - TV media fee.

  • radio - Radio media fee.

  • newspaper - Newspaper media fee.

  • sales - Sales.


Explore data

import pandas as pd
import numpy as np 
import matplotlib.pyplot as plt
import seaborn as sns
df = pd.read_csv('Data/Advertising.csv')
print(df.shape)
df.tail() 
(200, 5)
Unnamed: 0 TV radio newspaper sales
195 196 38.2 3.7 13.8 7.6
196 197 94.2 4.9 8.1 9.7
197 198 177.0 9.3 6.4 12.8
198 199 283.6 42.0 66.2 25.5
199 200 232.1 8.6 8.7 13.4
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 200 entries, 0 to 199
Data columns (total 5 columns):
 #   Column      Non-Null Count  Dtype  
---  ------      --------------  -----  
 0   Unnamed: 0  200 non-null    int64  
 1   TV          200 non-null    float64
 2   radio       200 non-null    float64
 3   newspaper   200 non-null    float64
 4   sales       200 non-null    float64
dtypes: float64(4), int64(1)
memory usage: 7.9 KB
df = df[["TV", "radio", "newspaper", "sales"]]
print(df.shape)
df.tail()
(200, 4)
TV radio newspaper sales
195 38.2 3.7 13.8 7.6
196 94.2 4.9 8.1 9.7
197 177.0 9.3 6.4 12.8
198 283.6 42.0 66.2 25.5
199 232.1 8.6 8.7 13.4
df.describe()
TV radio newspaper sales
count 200.000000 200.000000 200.000000 200.000000
mean 147.042500 23.264000 30.554000 14.022500
std 85.854236 14.846809 21.778621 5.217457
min 0.700000 0.000000 0.300000 1.600000
25% 74.375000 9.975000 12.750000 10.375000
50% 149.750000 22.900000 25.750000 12.900000
75% 218.825000 36.525000 45.100000 17.400000
max 296.400000 49.600000 114.000000 27.000000
df.corr()
TV radio newspaper sales
TV 1.000000 0.054809 0.056648 0.782224
radio 0.054809 1.000000 0.354104 0.576223
newspaper 0.056648 0.354104 1.000000 0.228299
sales 0.782224 0.576223 0.228299 1.000000
corr = df.corr()
sns.heatmap(corr, annot = True)
<AxesSubplot:>

sns.pairplot(df[["TV", "radio", "newspaper", "sales"]])
plt.show()

Labels = df["sales"]
features = df[["TV", "radio", "newspaper"]]
print(Labels.shape)
Labels
(200,)
0      22.1
1      10.4
2       9.3
3      18.5
4      12.9
       ... 
195     7.6
196     9.7
197    12.8
198    25.5
199    13.4
Name: sales, Length: 200, dtype: float64
print(features.shape)
features
(200, 3)
TV radio newspaper
0 230.1 37.8 69.2
1 44.5 39.3 45.1
2 17.2 45.9 69.3
3 151.5 41.3 58.5
4 180.8 10.8 58.4
... ... ... ...
195 38.2 3.7 13.8
196 94.2 4.9 8.1
197 177.0 9.3 6.4
198 283.6 42.0 66.2
199 232.1 8.6 8.7

200 rows × 3 columns


Data analysis

figure, ((ax1, ax2, ax3)) = plt.subplots(nrows=1, ncols=3)

 
figure.set_size_inches(20, 6)

 
sns.scatterplot(data=df, x="TV", y="sales", ax=ax1)
sns.scatterplot(data=df, x="radio", y="sales", ax=ax2)
sns.scatterplot(data=df, x="newspaper", y="sales", ax=ax3)
<AxesSubplot:xlabel='newspaper', ylabel='sales'>

매출액과의 scatter plot을 보면, TV가 매출액과 가장 관련이 높은 것 같아 보입니다.

라디오도 관련이 있지만 신문의 상관관계는 애매해 보입니다.


Linear regression (stats model)

import statsmodels.formula.api as sm
model = sm.ols(formula = "sales ~ TV + radio + newspaper", data = df).fit()

print(model.summary())
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                  sales   R-squared:                       0.897
Model:                            OLS   Adj. R-squared:                  0.896
Method:                 Least Squares   F-statistic:                     570.3
Date:                Tue, 26 Jul 2022   Prob (F-statistic):           1.58e-96
Time:                        13:51:44   Log-Likelihood:                -386.18
No. Observations:                 200   AIC:                             780.4
Df Residuals:                     196   BIC:                             793.6
Df Model:                           3                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      2.9389      0.312      9.422      0.000       2.324       3.554
TV             0.0458      0.001     32.809      0.000       0.043       0.049
radio          0.1885      0.009     21.893      0.000       0.172       0.206
newspaper     -0.0010      0.006     -0.177      0.860      -0.013       0.011
==============================================================================
Omnibus:                       60.414   Durbin-Watson:                   2.084
Prob(Omnibus):                  0.000   Jarque-Bera (JB):              151.241
Skew:                          -1.327   Prob(JB):                     1.44e-33
Kurtosis:                       6.332   Cond. No.                         454.
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.

Linear regression analysis (sklearn)

from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression

model = LinearRegression().fit(features, Labels)
print(model.intercept_, model.coef_)
2.938889369459412 [ 0.04576465  0.18853002 -0.00103749]
model1 = sm.ols(formula = "sales ~ TV + radio + newspaper", data=df).fit()
model2 = sm.ols(formula = "sales ~ TV + radio", data=df).fit()
model3 = sm.ols(formula = "sales ~ TV", data=df).fit()

print(model1.summary())
print(model2.summary())
print(model3.summary())
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                  sales   R-squared:                       0.897
Model:                            OLS   Adj. R-squared:                  0.896
Method:                 Least Squares   F-statistic:                     570.3
Date:                Tue, 26 Jul 2022   Prob (F-statistic):           1.58e-96
Time:                        14:29:52   Log-Likelihood:                -386.18
No. Observations:                 200   AIC:                             780.4
Df Residuals:                     196   BIC:                             793.6
Df Model:                           3                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      2.9389      0.312      9.422      0.000       2.324       3.554
TV             0.0458      0.001     32.809      0.000       0.043       0.049
radio          0.1885      0.009     21.893      0.000       0.172       0.206
newspaper     -0.0010      0.006     -0.177      0.860      -0.013       0.011
==============================================================================
Omnibus:                       60.414   Durbin-Watson:                   2.084
Prob(Omnibus):                  0.000   Jarque-Bera (JB):              151.241
Skew:                          -1.327   Prob(JB):                     1.44e-33
Kurtosis:                       6.332   Cond. No.                         454.
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                  sales   R-squared:                       0.897
Model:                            OLS   Adj. R-squared:                  0.896
Method:                 Least Squares   F-statistic:                     859.6
Date:                Tue, 26 Jul 2022   Prob (F-statistic):           4.83e-98
Time:                        14:29:52   Log-Likelihood:                -386.20
No. Observations:                 200   AIC:                             778.4
Df Residuals:                     197   BIC:                             788.3
Df Model:                           2                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      2.9211      0.294      9.919      0.000       2.340       3.502
TV             0.0458      0.001     32.909      0.000       0.043       0.048
radio          0.1880      0.008     23.382      0.000       0.172       0.204
==============================================================================
Omnibus:                       60.022   Durbin-Watson:                   2.081
Prob(Omnibus):                  0.000   Jarque-Bera (JB):              148.679
Skew:                          -1.323   Prob(JB):                     5.19e-33
Kurtosis:                       6.292   Cond. No.                         425.
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                  sales   R-squared:                       0.612
Model:                            OLS   Adj. R-squared:                  0.610
Method:                 Least Squares   F-statistic:                     312.1
Date:                Tue, 26 Jul 2022   Prob (F-statistic):           1.47e-42
Time:                        14:29:52   Log-Likelihood:                -519.05
No. Observations:                 200   AIC:                             1042.
Df Residuals:                     198   BIC:                             1049.
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      7.0326      0.458     15.360      0.000       6.130       7.935
TV             0.0475      0.003     17.668      0.000       0.042       0.053
==============================================================================
Omnibus:                        0.531   Durbin-Watson:                   1.935
Prob(Omnibus):                  0.767   Jarque-Bera (JB):                0.669
Skew:                          -0.089   Prob(JB):                        0.716
Kurtosis:                       2.779   Cond. No.                         338.
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.

model1.predict({"TV" : 300, "radio": 10, "newspaper": 4})
0    18.549433
dtype: float64
sales = 2.9389 + 0.0458*300 + 0.1885*10 - 0.001*4
sales
18.5599
model3.predict({"TV":[300, 10, 50]})
0    21.293586
1     7.507960
2     9.409426
dtype: float64

Reanalysis after data transformation

import warnings
warnings.filterwarnings("ignore")
figure, ((ax1, ax2, ax3)) = plt.subplots(nrows=1, ncols=3)
figure.set_size_inches(20, 6)



sns.distplot(df['newspaper'], ax = ax1)
sns.distplot(df['TV'], ax = ax2)
sns.distplot(df['radio'], ax = ax3)
<AxesSubplot:xlabel='radio', ylabel='Density'>

import numpy as np

df["log_newspaper"] = np.log(df["newspaper"] + 1)
print(df.shape)
df[["log_newspaper", "newspaper"]]
(200, 5)
log_newspaper newspaper
0 4.251348 69.2
1 3.830813 45.1
2 4.252772 69.3
3 4.085976 58.5
4 4.084294 58.4
... ... ...
195 2.694627 13.8
196 2.208274 8.1
197 2.001480 6.4
198 4.207673 66.2
199 2.272126 8.7

200 rows × 2 columns

figure, ((ax1, ax2), (ax3, ax4)) = plt.subplots(nrows=2, ncols=2)

 
figure.set_size_inches(20, 10)


sns.distplot(df["TV"], ax=ax1)
sns.distplot(df["radio"], ax=ax2)
sns.distplot(df["newspaper"], ax=ax3)
sns.distplot(df["log_newspaper"], ax=ax4)
<AxesSubplot:xlabel='log_newspaper', ylabel='Density'>

model1 = sm.ols(formula = "sales ~ TV + radio + newspaper", data=df).fit()

model4 = sm.ols(formula = "sales ~ TV + radio + log_newspaper", data=df).fit()

print(model1.summary())
print(model4.summary())
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                  sales   R-squared:                       0.897
Model:                            OLS   Adj. R-squared:                  0.896
Method:                 Least Squares   F-statistic:                     570.3
Date:                Tue, 26 Jul 2022   Prob (F-statistic):           1.58e-96
Time:                        14:50:46   Log-Likelihood:                -386.18
No. Observations:                 200   AIC:                             780.4
Df Residuals:                     196   BIC:                             793.6
Df Model:                           3                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      2.9389      0.312      9.422      0.000       2.324       3.554
TV             0.0458      0.001     32.809      0.000       0.043       0.049
radio          0.1885      0.009     21.893      0.000       0.172       0.206
newspaper     -0.0010      0.006     -0.177      0.860      -0.013       0.011
==============================================================================
Omnibus:                       60.414   Durbin-Watson:                   2.084
Prob(Omnibus):                  0.000   Jarque-Bera (JB):              151.241
Skew:                          -1.327   Prob(JB):                     1.44e-33
Kurtosis:                       6.332   Cond. No.                         454.
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                  sales   R-squared:                       0.897
Model:                            OLS   Adj. R-squared:                  0.896
Method:                 Least Squares   F-statistic:                     570.4
Date:                Tue, 26 Jul 2022   Prob (F-statistic):           1.53e-96
Time:                        14:50:46   Log-Likelihood:                -386.15
No. Observations:                 200   AIC:                             780.3
Df Residuals:                     196   BIC:                             793.5
Df Model:                           3                                         
Covariance Type:            nonrobust                                         
=================================================================================
                    coef    std err          t      P>|t|      [0.025      0.975]
---------------------------------------------------------------------------------
Intercept         2.8057      0.495      5.671      0.000       1.830       3.782
TV                0.0457      0.001     32.822      0.000       0.043       0.048
radio             0.1874      0.008     22.510      0.000       0.171       0.204
log_newspaper     0.0414      0.142      0.291      0.772      -0.240       0.322
==============================================================================
Omnibus:                       59.391   Durbin-Watson:                   2.074
Prob(Omnibus):                  0.000   Jarque-Bera (JB):              144.867
Skew:                          -1.316   Prob(JB):                     3.49e-32
Kurtosis:                       6.233   Cond. No.                         731.
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.

Interpretation of results

  • The correlation coefficient for newspapers changed from negative to positive, but still not significant at the P-value 0.05 level.

  • newspaper is not shown to be significant. In other words, it can be said that the effect of newspaper advertisements on sales is insignificant.

If you do additional verification

  • Diagnosis of linear regression results:

Equal variance verification through residual distribution, normality verification through QQ plot

Model evaluation of linear regression

  • RMSE, MAE, MSE are mainly used

Application plan

  • It’s 2011 and your boss still thinks newspaper ads are valid, but the data analysis doesn’t.

  • Newspaper advertisements should be stopped, and TV and radio advertisements should be focused.

  • However, TV advertising seems to be a little less cost-effective.

  • Radio advertisements have a higher correlation coefficient. Our products sell well the more they advertise on the radio.

댓글남기기