CS521 Programming in Python: Google Play Store


Question:

Google Play Store

Problem Statement:

The Google Play Store team is about to launch a new feature in which certain promising apps are boosted in visibility. The boost will manifest in multiple ways, including higher priority in recommendation sections (“Similar apps”, “You might also like”, “New and updated games”) and in search results. This feature will help bring more attention to newer apps that have potential.

Domain: General

Analysis to be done: The problem is to identify the apps that would be good for Google to promote. App ratings, which are provided by customers, are a great indicator of how good an app is. The problem therefore reduces to: predict which apps will have high ratings.

Content:

Dataset: Google Play Store data

Fields in the data:

-App: Application name

-Category: Category to which the app belongs

-Rating: Overall user rating of the app

-Reviews: Number of user reviews for the app

-Size: Size of the app

-Installs: Number of user downloads/installs for the app

-Type: Paid or Free

-Price: Price of the app

-Content Rating: Age group the app is targeted at – Children / Mature 21+ / Adult

-Genres: An app can belong to multiple genres (apart from its main category). For example, a musical family game will belong to the Music, Game, and Family genres.

-Last Updated: Date when the app was last updated on Play Store

-Current Ver: Current version of the app available on Play Store

-Android Ver: Minimum required Android version

Steps to perform:

-Load the data file using pandas. 

-Check for null values in the data. Get the number of null values for each column.

-Drop records with nulls in any of the columns. 

-Variables seem to have incorrect types and inconsistent formatting. You need to fix them:

-Size column has sizes in Kb as well as Mb. To analyze it, you’ll need to convert these to numeric:

-Extract the numeric value from the column

-Multiply the value by 1,000 if the size is given in Mb

-Reviews is a numeric field that is loaded as a string field. Convert it to numeric (int/float).

-Installs field is currently stored as a string and has values like 1,000,000+. Treat 1,000,000+ as 1,000,000: remove ‘+’ and ‘,’ from the field and convert it to integer.

-Price field is a string and has a $ symbol. Remove the ‘$’ sign and convert it to numeric.
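The loading and cleaning steps above can be sketched as follows. This is a minimal sketch, not the official solution: the inline CSV is a hypothetical stand-in for the real data file, and the size suffixes 'M' and 'k' are an assumption about how Mb and Kb appear in the Size column.

```python
import io
import pandas as pd

# Hypothetical stand-in for the real file. With the actual dataset you would
# pass its path to pd.read_csv instead of this in-memory sample.
raw_csv = io.StringIO(
    'App,Rating,Reviews,Size,Installs,Type,Price\n'
    'Alpha,4.1,159,19M,"10,000+",Free,0\n'
    'Beta,3.9,967,512k,"500,000+",Free,0\n'
    'Gamma,,87510,8.7M,"5,000,000+",Free,0\n'
    'Delta,4.5,215,2.5M,"1,000+",Paid,$4.99\n'
)
df = pd.read_csv(raw_csv)

# Number of null values per column, then drop every row that contains a null.
null_counts = df.isnull().sum()
df = df.dropna().reset_index(drop=True)


def size_to_kb(text):
    """Convert '19M' or '512k' to kilobytes (Mb values are multiplied by 1,000)."""
    text = text.strip()
    if text[-1] in "Mm":
        return float(text[:-1]) * 1000
    if text[-1] in "Kk":
        return float(text[:-1])
    return float("nan")  # any other label is left as missing (assumption)


df["Size"] = df["Size"].map(size_to_kb)

# Reviews may arrive as strings in the real file; to_numeric handles both cases.
df["Reviews"] = pd.to_numeric(df["Reviews"])

# '1,000,000+' -> 1000000
df["Installs"] = (
    df["Installs"]
    .str.replace("+", "", regex=False)
    .str.replace(",", "", regex=False)
    .astype(int)
)

# '$4.99' -> 4.99
df["Price"] = df["Price"].str.replace("$", "", regex=False).astype(float)

print(null_counts)
print(df.dtypes)
```

Passing regex=False matters here because both ‘+’ and ‘$’ are special characters in regular expressions.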

Sanity checks:

-Average rating should be between 1 and 5, as only these values are allowed on the Play Store. Drop the rows that have a value outside this range.

-Reviews should not be more than installs, as only those who installed can review the app. If there are any such records, drop them.

-For free apps (Type = “Free”), the price should not be > 0. Drop any such rows.
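One way to express the three sanity checks is as boolean masks that are combined before filtering. The small dataframe below is hypothetical and assumes the columns have already been converted to numeric types.

```python
import pandas as pd

# Hypothetical, already-cleaned sample: rows B, C and D each break one rule.
df = pd.DataFrame({
    "App": ["A", "B", "C", "D", "E"],
    "Rating": [4.2, 19.0, 3.8, 4.9, 4.0],
    "Reviews": [120, 50, 5000, 10, 300],
    "Installs": [1000, 500, 1000, 100, 5000],
    "Type": ["Free", "Free", "Free", "Free", "Paid"],
    "Price": [0.0, 0.0, 0.0, 1.99, 2.99],
})

valid_rating = df["Rating"].between(1, 5)        # inclusive on both ends
valid_reviews = df["Reviews"] <= df["Installs"]  # reviewers must have installed
valid_price = ~((df["Type"] == "Free") & (df["Price"] > 0))

# Keep only the rows that pass all three checks.
df = df[valid_rating & valid_reviews & valid_price].reset_index(drop=True)
print(df)
```

Keeping each mask in its own variable also makes it easy to count how many rows each rule removes, for example with (~valid_rating).sum().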

Performing univariate analysis: 

-Boxplot for Price: Are there any outliers? Think about the price of usual apps on Play Store.

-Boxplot for Reviews: Are there any apps with a very high number of reviews? Do the values seem right?

-Histogram for Rating: How are the ratings distributed? Is it more toward higher ratings?

-Histogram for Size

Note down your observations for the plots made above. Which of these seem to have outliers?
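The four univariate plots could be drawn on one figure roughly as below. This is a sketch using matplotlib directly; the sample values are made up for illustration, and the "Agg" backend is chosen only so the script runs without a display (in a notebook you would drop that line and call plt.show()).

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; an assumption for script use
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical cleaned sample with one extreme Price and one extreme Reviews value.
df = pd.DataFrame({
    "Price": [0.0, 0.0, 0.99, 2.99, 0.0, 399.99, 4.99, 0.0],
    "Reviews": [120, 45, 3000, 15, 900, 10, 2_500_000, 60],
    "Rating": [4.1, 3.8, 4.5, 4.0, 4.7, 3.2, 4.4, 4.3],
    "Size": [19000.0, 512.0, 8700.0, 2500.0, 45000.0, 1200.0, 98000.0, 3300.0],
})

fig, axes = plt.subplots(2, 2, figsize=(10, 8))

# Box plots make individual extreme points visible.
axes[0, 0].boxplot(df["Price"].to_numpy())
axes[0, 0].set_title("Price")
axes[0, 1].boxplot(df["Reviews"].to_numpy())
axes[0, 1].set_title("Reviews")

# Histograms show the overall shape of the distribution.
axes[1, 0].hist(df["Rating"].to_numpy(), bins=10)
axes[1, 0].set_title("Rating")
axes[1, 1].hist(df["Size"].to_numpy(), bins=10)
axes[1, 1].set_title("Size")

fig.tight_layout()
```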

Outlier treatment: 

-Price: From the box plot, it seems like there are some apps with a very high price. A price of $200 for an application on the Play Store is very high and suspicious! Check out the records with a very high price. Is 200 indeed a high price? Drop these, as most seem to be junk apps.

-Reviews: Very few apps have a very high number of reviews. These are all star apps that don’t help with the analysis and, in fact, will skew it. Drop records having more than 2 million reviews.

-Installs: There seem to be some outliers in this field too. Apps having a very high number of installs should be dropped from the analysis. Find out the different percentiles (10, 25, 50, 70, 90, 95, 99), decide on a threshold as the cutoff for outliers, and drop records having values more than that.
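The outlier treatment could look roughly like this. The data is hypothetical, treating a price of 200 or more as the cutoff is one reading of the text, and the install cutoff of 1,000,000 is purely illustrative: the assignment leaves that threshold for you to decide from the percentiles.

```python
import pandas as pd

# Hypothetical cleaned sample.
df = pd.DataFrame({
    "App": list("ABCDEFGH"),
    "Price": [0.0, 0.0, 2.99, 399.99, 0.0, 4.99, 0.0, 0.0],
    "Reviews": [100, 2_500_000, 40, 10, 900, 60, 75, 300],
    "Installs": [1_000, 1_000_000_000, 500, 100, 50_000, 5_000, 10_000, 10_000_000],
})

# Price: inspect the expensive records first, then drop them.
pricey = df[df["Price"] >= 200]
df = df[df["Price"] < 200]

# Reviews: drop records having more than 2 million reviews.
df = df[df["Reviews"] <= 2_000_000]

# Installs: look at the requested percentiles before choosing a cutoff.
percentiles = df["Installs"].quantile([0.10, 0.25, 0.50, 0.70, 0.90, 0.95, 0.99])
print(percentiles)

install_cutoff = 1_000_000  # illustrative choice, not prescribed by the assignment
df = df[df["Installs"] <= install_cutoff].reset_index(drop=True)
print(df)
```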

Bivariate analysis:

Let’s look at how the available predictors relate to the variable of interest, i.e., our target variable Rating. Make scatter plots (for numeric features) and box plots (for character features) to assess the relations between rating and the other features.

-Make scatter plot/jointplot for Rating vs. Price. What pattern do you observe? Does rating increase with price?

-Make scatter plot/jointplot for Rating vs. Size. Are heavier apps rated better?

-Make scatter plot/jointplot for Rating vs. Reviews. Do more reviews always mean a better rating?

-Make boxplot for Rating vs. Content Rating. Is there any difference in the ratings? Are some types liked better?

-Make boxplot for Rating vs. Category. Which genre has the best ratings?

For each of the plots above, note down your observation.
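A sketch of one numeric and one categorical comparison is shown below, using plain matplotlib rather than a joint plot. The data and the two Content Rating labels are hypothetical, and the per-group median is added only as a numeric cross-check on what the box plot shows.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; an assumption for script use
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical cleaned sample.
df = pd.DataFrame({
    "Rating": [4.5, 4.0, 3.5, 4.8, 4.2, 3.9],
    "Price": [0.0, 0.0, 1.99, 4.99, 0.0, 2.99],
    "Content Rating": ["Children", "Adult", "Children", "Adult", "Children", "Adult"],
})

fig, (ax_scatter, ax_box) = plt.subplots(1, 2, figsize=(10, 4))

# Numeric feature vs. target: scatter plot.
ax_scatter.scatter(df["Price"], df["Rating"])
ax_scatter.set_xlabel("Price")
ax_scatter.set_ylabel("Rating")

# Character feature vs. target: one box per group.
groups = {name: g["Rating"].to_numpy() for name, g in df.groupby("Content Rating")}
ax_box.boxplot(list(groups.values()))
ax_box.set_xticks(range(1, len(groups) + 1))
ax_box.set_xticklabels(list(groups.keys()))
ax_box.set_ylabel("Rating")

fig.tight_layout()

# Numeric cross-check for the box plot.
median_by_group = df.groupby("Content Rating")["Rating"].median()
print(median_by_group)
```

The same pattern can be repeated for Size, Reviews, and Category by swapping the column names.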

Data preprocessing

For the steps below, create a copy of the dataframe to make all the edits. Name it inp1.

-Reviews and Installs have some values that are still relatively very high. Before building a linear regression model, you need to reduce the skew. Apply log transformation (np.log1p) to Reviews and Installs.

-Drop columns App, Last Updated, Current Ver, and Android Ver. These variables are not useful for our task.

-Get dummy columns for Category, Genres, and Content Rating. This needs to be done as the models do not understand categorical data, and all data should be numeric. Dummy encoding is one way to convert character fields to numeric. Name of dataframe should be inp2.

-Train test split: apply a 70-30 split. Name the new dataframes df_train and df_test.

-Separate the dataframes into X_train, y_train, X_test, and y_test.
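The preprocessing and split steps might be written as below. The ten-row dataframe is hypothetical, and random_state=42 is an arbitrary choice made only so the split is reproducible.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical cleaned sample with the columns named in the assignment.
df = pd.DataFrame({
    "App": [f"app{i}" for i in range(10)],
    "Category": ["GAME", "TOOLS"] * 5,
    "Rating": [4.1, 3.9, 4.5, 4.0, 3.7, 4.8, 4.2, 3.5, 4.6, 4.3],
    "Reviews": [10, 200, 3000, 45, 60, 7000, 800, 90, 1000, 11],
    "Installs": [100, 1000, 100000, 500, 1000, 500000, 10000, 1000, 50000, 100],
    "Content Rating": ["Children", "Adult"] * 5,
    "Genres": ["Action", "Tools", "Puzzle", "Tools", "Action"] * 2,
    "Last Updated": ["January 1, 2018"] * 10,
    "Current Ver": ["1.0"] * 10,
    "Android Ver": ["4.0 and up"] * 10,
})

# Work on a copy so the original dataframe stays untouched.
inp1 = df.copy()

# log1p(x) = log(1 + x) reduces the right skew and is defined at x = 0.
inp1["Reviews"] = np.log1p(inp1["Reviews"])
inp1["Installs"] = np.log1p(inp1["Installs"])

inp1 = inp1.drop(columns=["App", "Last Updated", "Current Ver", "Android Ver"])

# One indicator column per distinct value of each categorical field.
inp2 = pd.get_dummies(inp1, columns=["Category", "Genres", "Content Rating"])

# 70-30 split, then separate the target from the predictors.
df_train, df_test = train_test_split(inp2, test_size=0.3, random_state=42)

X_train = df_train.drop(columns="Rating")
y_train = df_train["Rating"]
X_test = df_test.drop(columns="Rating")
y_test = df_test["Rating"]

print(X_train.shape, X_test.shape)
```

Any other text column still present at this point (for example Type) would also need to be encoded or dropped before fitting, since linear regression accepts only numeric inputs.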

Model building

-Use linear regression as the technique

-Report the R2 on the train set

-Make predictions on the test set and report R2
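The model-building steps can be sketched with scikit-learn as follows. The synthetic X and y below are assumptions that stand in for the X_train/X_test and y_train/y_test produced from inp2, so the R2 values printed here say nothing about the real dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: three numeric predictors and a noisy linear target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 4.0 + 0.3 * X[:, 0] - 0.2 * X[:, 1] + rng.normal(scale=0.1, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Fit on the training set only.
model = LinearRegression()
model.fit(X_train, y_train)

# R2 on the train set, then on predictions for the held-out test set.
r2_train = r2_score(y_train, model.predict(X_train))
r2_test = r2_score(y_test, model.predict(X_test))

print(f"Train R2: {r2_train:.3f}")
print(f"Test R2:  {r2_test:.3f}")
```

A test R2 that is much lower than the train R2 would suggest overfitting, which is worth noting in the report.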




