Data management involves multiple steps, including data cleaning and exploratory analysis. In this project, you will showcase your data management skills using pandas.
Data
You will use two publicly available files. The first contains data on causes of death; the second contains population data. Both files contain state-level information for multiple years.
NCHS_-_Leading_Causes_of_Death__United_States
nst-est2018-01
Requirements
To demonstrate your pandas skills, answer these questions:
Are Americans facing an increasing, decreasing, or steady likelihood of death?
What are the four leading causes of death for Americans?
Do individual states show the same four leading causes of death?
Are there year-by-year changes in the four leading causes of death nationwide?
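As a starting point for the ranking questions above, here is a minimal sketch of one way to find leading causes with pandas. It runs on a small made-up table: the column names (Year, Cause Name, Deaths), the cause labels, and all counts are assumptions for illustration and may not match the real file, which may also contain aggregate rows that need filtering out first.

```python
import pandas as pd

# Toy stand-in for the causes-of-death file. Column names and values are
# illustrative assumptions, not taken from the real data set.
deaths = pd.DataFrame({
    "Year": [2016] * 5 + [2017] * 5,
    "Cause Name": ["A", "B", "C", "D", "E"] * 2,
    "Deaths": [50, 40, 30, 20, 10, 55, 30, 32, 25, 5],
})

# Total deaths per cause across all years, keeping the four largest.
totals = deaths.groupby("Cause Name")["Deaths"].sum()
top_four = totals.nlargest(4)

# Rank causes within each year (1 = most deaths), then lay the top four
# out as a table with one column per year.
deaths["Rank"] = (
    deaths.groupby("Year")["Deaths"]
    .rank(ascending=False, method="first")
    .astype(int)
)
top_by_year = deaths[deaths["Rank"] <= 4].pivot(
    index="Rank", columns="Year", values="Cause Name"
)

print(top_four)
print(top_by_year)
```

The same groupby pattern could be extended with a state column to compare rankings across states. Note that this sketch ranks by raw counts, which is only reasonable when the populations being compared are the same.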
Use appropriately constructed and formatted tables to show results. There is no need to use visualization in this project.
Use population data appropriately to demonstrate your understanding of how variables are normalized/standardized.
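One common way to normalize death counts is to express them as a rate per 100,000 residents. The sketch below illustrates the idea on made-up numbers. It assumes the population table arrives in a wide layout (one column per year) and must be reshaped before merging; the column names, state labels, and figures are all hypothetical.

```python
import pandas as pd

# Toy death counts in long format (one row per state and year).
# All names and values here are illustrative assumptions.
deaths = pd.DataFrame({
    "State": ["X", "X", "Y", "Y"],
    "Year": [2016, 2017, 2016, 2017],
    "Deaths": [500, 520, 90, 99],
})

# Toy population estimates in wide format (one column per year).
population_wide = pd.DataFrame({
    "State": ["X", "Y"],
    2016: [1_000_000, 100_000],
    2017: [1_040_000, 110_000],
})

# Reshape to long format so each row is one state-year pair.
population = population_wide.melt(
    id_vars="State", var_name="Year", value_name="Population"
)
population["Year"] = population["Year"].astype(int)

# Attach the matching population to each death count; validate guards
# against accidental duplicate state-year keys.
merged = deaths.merge(
    population, on=["State", "Year"], how="left", validate="one_to_one"
)

# Deaths per 100,000 residents.
merged["Deaths per 100k"] = merged["Deaths"] / merged["Population"] * 100_000

print(merged.round(1))
```

In this toy data, raw deaths rise in both states from 2016 to 2017, yet the rate per 100,000 stays flat because population grew at the same pace. That difference is why comparing raw counts across years or states can mislead.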
Show skill in constructing a formal report using Jupyter.
Your formal report should contain components such as:
An introduction that discusses the scope of the analysis
A description of data used in the analysis along with data cleaning procedures
Code that clearly shows how an algorithm is implemented
Results
Discussion of results and generation of insight when appropriate
Summary when appropriate