Question 1
Bike-sharing has gained popularity in many cities over the last few decades. It can be seen as a means of promoting a healthy lifestyle through more physical activity and of mitigating environmental problems such as air pollution and traffic congestion. Yet managing public bikes for sharing is not easy. Many factors (e.g. socio-demographic, weather, land use) affect user demand. In practice, bike-sharing companies have to overcome many challenges to make the business sustainable.
a. Identify one (1) potential business problem faced by bike-sharing companies and describe how this problem would negatively affect society if left unsolved.
b. Based on the problem identified in (a), give an example of
(i) how descriptive data mining can be used to solve the problem;
(ii) how predictive data mining can be used to solve the problem.
Assume that a dataset is collected from a company that offers a docked bike-sharing service. It records the number of bikes rented per hour each day and the weather conditions of the day. The variables in the dataset are described in Table 1.
Table 1. Description of the variables collected
Variable | Description |
Date | Calendar date in YYYY-MMM-DD |
Hour | Hour of the day (e.g. the value of 0 means 12 am, the value of 1 means 1 am, etc.) |
Temperature | The temperature of the day (in Celsius) |
Humidity | Humidity % of the day |
Wind speed | Wind speed of the day (in m/s) |
Rainfall | Rainfall of the day (in mm) |
Snowfall | Snowfall of the day (in cm) |
Holiday | Whether it is a holiday on the day (Yes / No) |
Season | Season of the day (Winter, Spring, Summer, Autumn) |
Rented bike count | Number of bikes rented at each hour on the day |
c. Suggest two (2) additional variables that could be useful in analysing the demand for bike-sharing. Explain briefly.
d. Assume that two interesting findings were obtained in data exploration:
(i) the demand is lower in winter than in summer; and
(ii) the demand is higher on weekdays than on weekends.
Based on these findings, propose two (2) implications for improving the bike-sharing service.
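For reference, the kind of data exploration that could yield findings like those in (d) can be sketched as follows. This is only an illustration under stated assumptions: the rows are made-up values (not taken from the dataset), the field names `date`, `hour`, `season` and `rented_bike_count` are hypothetical renderings of the variables in Table 1, and the date is assumed to use a three-letter English month abbreviation (e.g. `2018-Jan-15`).

```python
from collections import defaultdict
from datetime import date

# Assumed month abbreviations for the 'YYYY-MMM-DD' format in Table 1.
MONTHS = {name: number for number, name in enumerate(
    ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
     "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"], start=1)}


def parse_date(text):
    """Parse a 'YYYY-MMM-DD' string such as '2018-Jan-15' into a date."""
    year, month, day = text.split("-")
    return date(int(year), MONTHS[month], int(day))


def day_type(row):
    """Label a row as 'weekday' (Mon-Fri) or 'weekend' (Sat-Sun)."""
    return "weekday" if parse_date(row["date"]).weekday() < 5 else "weekend"


def mean_by(rows, key):
    """Return the mean hourly rented bike count for each group given by key(row)."""
    totals = defaultdict(lambda: [0, 0])  # group -> [sum, number of rows]
    for row in rows:
        bucket = totals[key(row)]
        bucket[0] += row["rented_bike_count"]
        bucket[1] += 1
    return {group: total / n for group, (total, n) in totals.items()}


# Made-up rows for illustration only; they are not real observations.
rows = [
    {"date": "2018-Jan-15", "hour": 8, "season": "Winter", "rented_bike_count": 120},
    {"date": "2018-Jan-20", "hour": 8, "season": "Winter", "rented_bike_count": 60},
    {"date": "2018-Jul-16", "hour": 8, "season": "Summer", "rented_bike_count": 900},
    {"date": "2018-Jul-21", "hour": 8, "season": "Summer", "rented_bike_count": 500},
]

by_season = mean_by(rows, lambda row: row["season"])
by_day_type = mean_by(rows, day_type)
print(by_season)
print(by_day_type)
```

Note that the weekday/weekend label is not one of the variables in Table 1; in this sketch it is derived from the Date variable.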
e. Identify two (2) limitations of the analysis of user demand using the dataset (i.e. as described in Table 1) in the bike-sharing context.