BUS5PA Predictive Analytics

Part A - Cluster Analysis (35%)

A wholesale supply company sells four types of dungarees: fashion jeans, leisure jeans, stretch jeans and original jeans. The owner of the supply company is interested in identifying groupings of the stores to which his products are supplied. To identify such groupings, the owner has selected the DUNGAREE data set, which gives the number of pairs of each of the four types of dungarees sold at stores over a specific time period.

In the DUNGAREE data set, each row represents an individual store. There are six columns in the data set. One column is the store identification number, four columns contain the number of pairs of each type of jeans sold, and the last column contains the total number of pairs sold.

The variables in the data set are shown below with the appropriate roles and levels.

Name      Model Role  Measurement Level  Description
STOREID   ID          Nominal            Identification number of the store
FASHION   Input       Interval           Number of pairs of fashion jeans sold at the store
LEISURE   Input       Interval           Number of pairs of leisure jeans sold at the store
STRETCH   Input       Interval           Number of pairs of stretch jeans sold at the store
ORIGINAL  Input       Interval           Number of pairs of original jeans sold at the store
SALESTOT  Rejected    Interval           Total number of pairs of jeans sold (the sum of FASHION, LEISURE, STRETCH, and ORIGINAL)
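
The variable descriptions above lend themselves to two quick checks before clustering: counting missing values per column and confirming that SALESTOT really is the sum of the four inputs. The following is a minimal Python sketch rather than a SAS Enterprise Miner procedure. The rows are invented for illustration, and the helper names `missing_counts` and `total_is_redundant` are hypothetical.

```python
# Hypothetical rows shaped like the DUNGAREE table. The numbers are invented
# for illustration and are NOT taken from the real data set.
rows = [
    {"STOREID": 1, "FASHION": 80, "LEISURE": 2500, "STRETCH": 200, "ORIGINAL": 900, "SALESTOT": 3680},
    {"STOREID": 2, "FASHION": 450, "LEISURE": 3000, "STRETCH": 150, "ORIGINAL": 1200, "SALESTOT": 4800},
    {"STOREID": 3, "FASHION": 90, "LEISURE": None, "STRETCH": 700, "ORIGINAL": 2600, "SALESTOT": None},
]

INPUTS = ["FASHION", "LEISURE", "STRETCH", "ORIGINAL"]


def missing_counts(rows, columns):
    """Count missing (None) entries per column."""
    return {c: sum(1 for r in rows if r[c] is None) for c in columns}


def total_is_redundant(rows):
    """True if SALESTOT equals the sum of the four inputs on every complete row."""
    needed = INPUTS + ["SALESTOT"]
    complete = [r for r in rows if all(r[c] is not None for c in needed)]
    return all(r["SALESTOT"] == sum(r[c] for c in INPUTS) for r in complete)


print(missing_counts(rows, INPUTS + ["SALESTOT"]))
print(total_is_redundant(rows))
```

If the second check holds on the real data, SALESTOT adds no information beyond the four inputs, so including it in a distance-based clustering would count the same sales twice.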

You, as the data analyst, are required to conduct a cluster analysis of the data set and provide an insightful report to the owner of the wholesale supply company so that he can take timely action to grow his revenue.

  a. Create a new diagram. Name the diagram Jeans.
  b. Define the data set DUNGAREE as a data source.
  c. Determine whether the model roles and measurement levels assigned to the variables are appropriate. Examine the distribution of the variables.
    • Are there any unusual data values?
    • Are there missing values that should be replaced?
  d. Assign the variable STOREID the model role ID and the variable SALESTOT the model role Rejected. Make sure that the remaining variables have the Input model role and the Interval measurement level. Why should the variable SALESTOT be rejected?
  e. Add an Input Data Source node to the diagram workspace and select the DUNGAREE data table as the data source.
  f. Add a Cluster node to the diagram workspace and connect it to the Input Data Source node.
  g. Select the Cluster node. Leave the default setting as Internal Standardization → Standardization. What would happen if inputs were not standardized? Explain using knowledge from discussions in the class.
  h. Run the diagram from the Cluster node and examine the results. Does the number of clusters created seem reasonable? Discuss using knowledge from class discussions (what a cluster is, how many clusters you should have, etc.).
  i. Specify a maximum of six clusters and re-run the Cluster node. How do the number and quality of clusters compare to those obtained in part h?
  j. Use the Segment Profile node to summarize the nature of the clusters. Describe the profiles.
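
The standardization question above can be explored outside SAS Enterprise Miner with a small experiment. The sketch below is written under stated assumptions: it uses plain Lloyd's k-means in Python as a stand-in (it is not the algorithm the Cluster node runs), the four stores and their (FASHION, ORIGINAL) values are invented, and the starting centers are fixed so the run is repeatable.

```python
import math
import statistics

# Hypothetical (FASHION, ORIGINAL) sales for four stores. Values are invented.
stores = [(10, 1000), (10, 1200), (90, 1100), (90, 1300)]


def standardize(rows):
    """Z-score each column: subtract its mean, divide by its standard deviation."""
    columns = list(zip(*rows))
    means = [statistics.mean(c) for c in columns]
    sds = [statistics.pstdev(c) for c in columns]
    return [tuple((v - m) / s for v, m, s in zip(row, means, sds)) for row in rows]


def euclidean(p, q):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def kmeans(points, start_centers, iterations=10):
    """Plain Lloyd's k-means from fixed starting centers. Returns one label per point."""
    centers = [tuple(c) for c in start_centers]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        labels = [min(range(len(centers)), key=lambda k: euclidean(p, centers[k]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old center if a cluster loses all its points
                centers[k] = tuple(statistics.mean(col) for col in zip(*members))
    return labels


raw_labels = kmeans(stores, [stores[0], stores[-1]])
scaled = standardize(stores)
scaled_labels = kmeans(scaled, [scaled[0], scaled[-1]])
print(raw_labels)
print(scaled_labels)
```

On these made-up numbers the raw run pairs stores 1 and 3 against stores 2 and 4, a split driven by ORIGINAL because its values are numerically larger. The standardized run pairs stores 1 and 2 against stores 3 and 4, which follows the clear gap in FASHION. Real data may behave differently, but the mechanism is the same: without standardization, the variable with the largest scale dominates the distance calculation.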

Part B - Market Basket Analysis and Association Rules (35%)

To plan innovative promotions that move items often purchased together, a store is interested in a market basket analysis of items purchased from the Health and Beauty Aids Department and the Stationery Department. You are a member of the analytics team assigned to the task.

The store chose to conduct a market basket analysis of specific items purchased from these two departments. The TRANSACTIONS data set contains information about approximately 400,000 transactions made over the past three months.

The following products are represented in the data set:

bar soap        markers           prescription medications
bows            pain relievers    shampoo
candy bars      pencils           toothbrushes
deodorant       pens              toothpaste
greeting cards  perfume           wrapping paper
magazines       photo processing

You have access to SAS Enterprise Miner data analytics tools and have decided to carry out a market basket and association rule analysis of the data. The following instructions will help you to set up the SAS diagram for the analysis.

There are four variables in the data set:

Name        Model Role  Measurement Level  Description
Outlet      Rejected    Nominal            Identification number of the store
PurchaseId  ID          Nominal            Transaction identification number
Item        Target      Nominal            Product purchased
Amount      Rejected    Interval           Quantity of this product purchased


  1. Create a new diagram. Name the diagram Retail.
  2. Create a new data source using the data set RETAIL.
  3. Assign the variables Outlet and Amount the model role Rejected. These variables are not used in this analysis. Assign the ID model role to the variable PurchaseId and the Target model role to the variable Item. Change the data source role to Transaction.
  4. Add the RETAIL data set and an Association node to the diagram.
  5. Change the setting for the Export Rule by ID property to Yes.
  6. Leave the remaining default settings for the Association node and run it.

Examine the results of the association analysis. Your team leader has indicated that the answers to the following questions will be useful to management. Answer the questions and prepare a report giving evidence to support your answers (e.g., screenshots, numeric values, etc.).

  1. What is the lift value of a rule? What is the importance (practical use) in calculating lift?
  2. What does the highest lift value signify?
  3. Based on the association rules, briefly describe 3 example product bundles and promotions that you might suggest, with justifications.
  4. You are required to present the outcomes of the analysis to the managers of the Health and Beauty Aids Department and the Stationery Department. Prepare a brief presentation (max. 5 slides) presenting:
    • The problem
    • Your solution/approach
    • Outcomes
    • Analysis results and interpretation

In your slides, you should present the approach and outcomes, such as support, confidence and lift, and explain how the product bundles you suggested could be used by the departments (their practical value).
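
The three rule measures named above can be computed by hand on a toy example. For a rule A → B, support is the share of transactions containing both A and B, confidence is that share divided by the share containing A, and lift is confidence divided by the share containing B. A lift above 1 means the items appear together more often than they would if purchased independently. The following is a minimal Python sketch, not the Association node's implementation. The five baskets are invented (item names are borrowed from the product list), and `support` and `rule_metrics` are hypothetical helper names.

```python
# Hypothetical baskets. Each set is one transaction. Contents are invented.
baskets = [
    {"toothpaste", "toothbrushes"},
    {"toothpaste", "toothbrushes", "shampoo"},
    {"pens", "pencils"},
    {"toothpaste", "shampoo"},
    {"pens", "greeting cards"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    return sum(1 for b in baskets if itemset <= b) / len(baskets)


def rule_metrics(antecedent, consequent, baskets):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    both = support(antecedent | consequent, baskets)
    confidence = both / support(antecedent, baskets)
    lift = confidence / support(consequent, baskets)
    return both, confidence, lift


s, c, l = rule_metrics({"toothpaste"}, {"toothbrushes"}, baskets)
print(f"support={s:.2f} confidence={c:.2f} lift={l:.2f}")
```

In this toy data the rule toothpaste → toothbrushes has support 2/5, confidence 2/3 and lift 5/3, so a basket with toothpaste is about 1.67 times as likely to contain toothbrushes as an arbitrary basket.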

Part C - Open Discussion - Analytics Case Study (30%)

Read the following articles related to the Guest Lecture (week 12).

  1. Retail Demand Forecasting (main reading material)
  2. How AI can reshape retail (additional reading)

You are expected to attend the guest lecture and interact with the guest lecturer to clarify any doubts or unclear areas before responding to the following questions.

  1. Why has demand forecasting become essential in the retail sector at present (compared to the past)?
  2. What benefit would demand forecasting provide to the retailer? What are the benefits to the customer?
  3. How does predictive analytics relate to demand forecasting, and why is it important (especially in retail)?

Your response to each of the 3 questions should be at least 250 words and no more than 400 words.

