The course BSAN2205 Machine Learning for Business has three assessment items: a Project Plan, a Project Report and Presentation, and a School-based Take-home Assessment (weighted 20%, 50%, and 30%, respectively). These notes outline my expectations for the Project Plan and introduce the context for the project work. I intend the Plan, or proposal, to be a formative piece of assessment. The Plan should set the groundwork for your project and project report. I will provide feedback on your Plan that you can incorporate into your project.
Background and Context
In competitive markets, businesses face the challenge of acquiring and retaining customers.
Consider subscription services: for example, subscriptions to digital editions of newspapers and magazines, streaming services (film and television, music, news, sport, etc.), and cable television services (Foxtel). Other businesses face the same challenges, for example, airlines, banks, insurance companies, telecommunication companies, retailers, restaurants, and personal services businesses. One retention strategy is to deepen relationships with customers through “upselling” – convincing a customer to buy something in addition to, or more expensive than, what they have previously purchased from a business. Streaming services like Netflix and Spotify strive to build customer “engagement” – increasing the number of downloads and/or the time spent streaming.
Bank marketing provides the specific context for the project. Like many consumer businesses, banks confront the challenges of attracting new customers and retaining existing customers. Strategies for retaining customers provide the setting for the project. For banks, engagement is reflected in the number of products (active accounts) customers maintain. Often retention strategies have the goal of deepening engagement by encouraging customers to open new accounts. Consolidating accounts with one rather than many banks may offer consumers some benefits at the margin. For example, highly engaged customers may be offered lower rates on loans and access to services for which they do not have to pay (at least, not directly), and they may reduce the overall burden of managing multiple banking relationships. For banks, the benefits of more highly engaged customers are larger and more stable cash flows, lower marketing expenses (attracting a new customer typically costs more than retaining an existing one), and thus potentially higher profits.
Before moving on, I would like you to appreciate that many problems in business can be solved through effective predictive models of binary outcomes. Examples include the decision to purchase or not purchase shares in a company, to acquire or merge with another business, or to hire or not hire a prospective employee. All of these decisions involve binary outcomes (in some cases, they can be characterised as “go/no go” decisions). The specific focus of the project is customer acceptance of a marketing offer, but the concepts and models have much broader application.
Aims of the Proposal
The Project Plan has two broad aims. First, the Plan is a marketing document. Second, the Plan is a roadmap. As a marketing document, the Project Plan must sell the project to the stakeholder(s) and/or client. Thus, the Plan should emphasise the benefits of doing the project. As a proposal or “roadmap,” the Project Plan should outline in some detail the likely direction of the project. This might include identifying the key variables and methods of analysis.
Key Sections of the Project Plan
More specifically, you might consider including the following sections in your Plan.
In the background statement (section 1), you may wish to sketch out the initial motivation for the study. This might include reference to the key stakeholder(s) and/or client. I recommend targeting the proposal at a (hypothetical) client to bring a degree of realism to the project and to help focus it (for example, you could contextualise the study with reference to an Australian bank). In this section, also make sure to sell the project. For example, what are the likely benefits of doing the project, what new insights do you anticipate, and how will these improve decision making?
You might find value in a section 2 that outlines the conceptual framework for your project work. If you focus your project on customer engagement with banks, for example, you might give some thought to the advantages to banks and their customers from greater engagement and to the process that might drive customers to respond favourably to a bank’s marketing efforts. My preference is that you use your own common sense and logic to define the key concepts and to develop a rationale for their links. I do not expect a review of the literature, but you might find some desk (Google) research helpful in identifying past studies that have explored issues similar to the ones you are exploring. A boxes-and-arrows diagram might help to illustrate the core concepts and relationships.
The section on variable selection is probably the key section (section 3). Be very specific about the variables you intend to study. In the social science tradition, much emphasis is placed on explaining why the variables under study were selected – the focus is explanation rather than prediction. This is less the case in the data science paradigm, with its focus on prediction – business analysts/data scientists may wish to specify an (initial) model that includes all of the possible feature variables. My minimum expectation for this section is that you provide some description of the output and feature variables you intend to study, and explain why you chose these feature variables.
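One way to be specific in section 3 is to list each variable with its role, type, and a one-sentence rationale. The sketch below shows that idea in Python. It is only an illustration: the variable names (new_account, age, balance, housing_loan, previous_contacts) and the rationales are hypothetical placeholders and are not taken from the course dataset, so check them against the data you are given.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariableSpec:
    name: str       # column name; hypothetical, check against the course dataset
    role: str       # "output" or "feature"
    kind: str       # "binary", "numeric", or "categorical"
    rationale: str  # one-sentence reason for including the variable


# Hypothetical plan: one output variable and a handful of candidate features.
PLAN_VARIABLES = [
    VariableSpec("new_account", "output", "binary",
                 "Records whether the customer opened a new account after the offer."),
    VariableSpec("age", "feature", "numeric",
                 "Banking needs may differ across life stages."),
    VariableSpec("balance", "feature", "numeric",
                 "Customers with larger balances may have more funds to place."),
    VariableSpec("housing_loan", "feature", "binary",
                 "An existing loan signals current engagement with the bank."),
    VariableSpec("previous_contacts", "feature", "numeric",
                 "Earlier campaign contact may shape the response to a new offer."),
]


def feature_names(specs):
    """Return the names of all feature variables, in plan order."""
    return [spec.name for spec in specs if spec.role == "feature"]


def output_name(specs):
    """Return the single output variable, or raise if the plan is ambiguous."""
    outputs = [spec.name for spec in specs if spec.role == "output"]
    if len(outputs) != 1:
        raise ValueError(f"expected exactly one output variable, found {len(outputs)}")
    return outputs[0]


def missing_rationales(specs):
    """Return names of variables that still lack a stated rationale."""
    return [spec.name for spec in specs if not spec.rationale.strip()]


if __name__ == "__main__":
    print("Output:  ", output_name(PLAN_VARIABLES))
    print("Features:", ", ".join(feature_names(PLAN_VARIABLES)))
    print("Missing rationale:", missing_rationales(PLAN_VARIABLES) or "none")
```

A table in your Plan with the same four columns (name, role, type, rationale) would serve the same purpose; the code is simply one way to keep the list consistent.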
Section 4 outlines the methods of analysis. Here I would like you to be specific about the models you might use to analyse the data. You may have completed the course BSAN2204 Methods of Business Analytics. A focus of that course was predicting a numeric output variable (“song hotness”) using linear regression. For this course (BSAN2205 Machine Learning for Business), our target variable is categorical: it records whether customers opened or did not open a new account in response to the Bank’s marketing efforts. My expectations for section 4 are that you can identify an appropriate statistical model (or models) for analysing the data, state something about the assumptions of the model, and perhaps list the key steps in employing the model. You could also write out the specific model you intend to estimate (write out the regression equation, for example, with reference to the y- and x-variables).
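As an illustration of writing out a model for a binary target, here is how a logistic regression could be stated. Logistic regression is used only as one candidate model for a binary outcome; it is an assumption for this example and not a requirement. The notation is also assumed: y_i equals 1 if customer i opened a new account and 0 otherwise, x_{i1}, ..., x_{ik} are the k feature variables you choose, p_i is the probability that customer i opens an account, and the betas are the coefficients to be estimated.

```latex
\[
p_i = \Pr(y_i = 1 \mid x_{i1}, \dots, x_{ik})
    = \frac{1}{1 + \exp\!\bigl(-(\beta_0 + \beta_1 x_{i1} + \dots + \beta_k x_{ik})\bigr)}
\]
% Equivalently, the log-odds of opening an account are linear in the features:
\[
\log\!\left(\frac{p_i}{1 - p_i}\right)
    = \beta_0 + \beta_1 x_{i1} + \dots + \beta_k x_{ik}
\]
```

In your Plan, you would replace the generic x-variables with the named feature variables from section 3.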
Section 5 – form of the results – should give an indication of what the outputs might look like. You could do a mock-up of the results. You could also say that you will document the results in PowerPoint format and present them verbally. The next steps section concludes the proposal. Here you might remind the client of the core benefits and indicate what you need to initiate the project (final client sign-off, for example). You could also add a timeline or perhaps a Gantt chart (timetabling the key activities, when you will do them, and identifying any critical paths). At this stage, refrain from doing any statistical analysis of the data – save the analysis for the project report. Use the Plan to develop some general knowledge of the models you intend to use and sketch out your best plan for the analysis you intend to implement.
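Returning to the results mock-up for section 5: for a binary target, one common way to present results is a two-by-two table of actual against predicted outcomes, with a few summary rates. The sketch below lays out such a table in Python. It is a layout example only, and the choice of table and rates is an assumption for this example. The counts in the usage example are invented placeholders, not results from the data, so no analysis is involved.

```python
import math


def rate(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def summary_rates(tp, fp, fn, tn):
    """Compute headline rates from the four cells of a 2x2 confusion matrix."""
    return {
        "accuracy": rate(tp + tn, tp + fp + fn + tn),
        "precision": rate(tp, tp + fp),  # of predicted openers, share who opened
        "recall": rate(tp, tp + fn),     # of actual openers, share predicted
    }


def mock_results_table(tp, fp, fn, tn):
    """Return a plain-text mock-up of the confusion matrix and summary rates."""
    rates = summary_rates(tp, fp, fn, tn)
    rows = [
        f"{'':<22}{'Predicted: opened':>20}{'Predicted: not opened':>24}",
        f"{'Actual: opened':<22}{tp:>20}{fn:>24}",
        f"{'Actual: not opened':<22}{fp:>20}{tn:>24}",
        "",
        "  ".join(f"{name.capitalize()}: {value:.2f}" for name, value in rates.items()),
    ]
    return "\n".join(rows)


if __name__ == "__main__":
    # Placeholder counts chosen only to show the layout; they are NOT results.
    print(mock_results_table(tp=40, fp=10, fn=20, tn=30))
```

In the Plan itself, an empty table shell with the same row and column labels would do the same job.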
The final section of your Plan might address next steps (section 6). You can briefly restate the main motivation for your Plan and highlight the key “next steps.” Remember the Plan is a marketing document – perhaps remind the reader that this project is an important one and should be completed now.
The Bank Marketing Dataset
The project work for this Semester uses the Bank Marketing dataset. Several variations of the dataset exist. There is one variation available from the UCI Machine Learning Repository and another variation on Kaggle. We will use the version of the dataset available from Kaggle (with some minor variations). Owned by Google, Kaggle is an online community of business analysts and data scientists. Users can freely upload and download data to and from the site (kaggle.com). Kaggle runs competitions often sponsored by third parties. I encourage you to explore the Kaggle website and join the Kaggle community. Kaggle is a great place for those with an interest in machine learning.