You are working as a marketing analyst at Delstra, a leading telecommunications provider in Australia. Delstra is concerned about its relatively high churn ratio. The churn ratio is the percentage of customers who churn (i.e., leave the company) in a given period.
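To make the definition concrete, the following minimal sketch computes a churn ratio from a list of 0/1 churn indicators (1 = the customer churned in the period). The function name `churn_ratio` and the example values are hypothetical and are not taken from the Delstra data set.

```python
def churn_ratio(churn_flags):
    """Return the percentage of customers who churned in the period.

    churn_flags: iterable of 0/1 values, one per customer (1 = churned).
    """
    flags = list(churn_flags)
    if not flags:
        raise ValueError("need at least one customer")
    return 100.0 * sum(flags) / len(flags)


# Hypothetical example: 3 of 12 customers churned in the month.
example_flags = [1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0]
example_ratio = churn_ratio(example_flags)
print(f"Churn ratio: {example_ratio:.1f}%")  # Churn ratio: 25.0%
```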
You are assigned to a project team that has been tasked with reducing customer churn. The team decides that your analytical skills are highly relevant to the task, given that Delstra has excellent customer data. In particular, you all agree that you should develop a customer churn prediction model using logistic regression analysis so that Delstra can identify the customers who are most likely to churn.
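As a rough illustration of the idea, the sketch below fits a logistic regression to synthetic data and ranks customers by their predicted churn probability. Everything in it is an assumption made for illustration: the two predictors (`tenure_years`, `month_to_month`), the simulated coefficients, and the sample size are hypothetical and unrelated to the actual data set. Statistical software usually estimates the model by maximum likelihood with a dedicated routine; plain gradient descent on the log-loss is used here only to keep the example self-contained.

```python
import math
import random


def sigmoid(z):
    """Logistic function mapping a linear score to a probability in (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(beta, xi):
    """Predicted churn probability for one customer; beta[0] is the intercept."""
    return sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], xi)))


def fit_logistic(X, y, lr=0.1, epochs=1500):
    """Estimate intercept and coefficients by batch gradient descent on the log-loss."""
    n, k = len(X), len(X[0])
    beta = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            err = predict_proba(beta, xi) - yi
            grad[0] += err
            for j, v in enumerate(xi):
                grad[j + 1] += err * v
        beta = [b - lr * g / n for b, g in zip(beta, grad)]
    return beta


# Synthetic, hypothetical customers: churn is simulated to be less likely
# with longer tenure and more likely on a month-to-month contract.
random.seed(0)
X, y = [], []
for _ in range(300):
    tenure_years = random.uniform(0, 6)
    month_to_month = random.randint(0, 1)
    true_p = sigmoid(0.5 - 0.8 * tenure_years + 1.2 * month_to_month)
    X.append([tenure_years, month_to_month])
    y.append(1 if random.random() < true_p else 0)

beta = fit_logistic(X, y)
scores = [predict_proba(beta, xi) for xi in X]

# Rank customers from most to least likely to churn.
ranked = sorted(range(len(X)), key=lambda i: scores[i], reverse=True)
print("Estimated coefficients:", [round(b, 2) for b in beta])
print("Indices of the 5 customers with the highest predicted churn risk:", ranked[:5])
```

The ranking step is what makes the model useful for retention work: customers at the top of the list are the natural candidates for targeted offers.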
You have a data set that has been randomly drawn from the customer base in the same month and contains 7043 observations. Each observation (row) represents one customer.
The names of the variables in the data set are self-explanatory. In particular, the data set includes information about:
Customers who left within the last month: Churn_Yes indicates that a particular customer has left. The variable can only take the values 0 or 1 (where 1 indicates churn).
Services used by each customer: phone, multiple lines, internet, online security, online backup, device protection, tech support, and streaming TV and movies.
Customer account information: how long they have been a customer, contract, payment method, paperless billing, monthly charges, and total charges.
Demographic information about customers: gender, age range, and whether they have partners and dependents.
The data set has already been cleaned, so you do not have to perform any data preparation or data cleaning. Please do not exclude any observations from the data (e.g., because of outliers); work with the full sample.
Please follow the instructions in the questions.
You can achieve a maximum of 100 marks.
The submission has to be made via the Moodle submission link! Please make sure that you read the instructions provided with the submission link carefully!