This coursework is designed to help students understand further concepts and demonstrate practical skills related to the Hadoop environment.
Coursework tasks and problem statement

1. Identify and evaluate a number of publicly available datasets related to air pollution and the severity of respiratory disease. These may come from sources such as kaggle.com or data.gov.uk.
2. Select appropriate datasets, as informed by their interests.
3. Integrate and import these datasets into a suitable data storage and processing system, providing a rationale for the choice.
4. Perform meaningful analysis of the data to derive some simple, useful information, as far as the selected datasets allow.
5. Provide a visualisation of the analysis using any Hadoop-related technologies the students deem suitable.

"If you need the complete report and code, please leave a comment on this blog. A few screenshots, steps, and code listings are not included, on the assumption that you can work them out yourself. If you have difficulty getting the output, please leave a comment and I will help you with that."

Sample Coursework for the above t...
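To make tasks 3 and 4 more concrete, here is a minimal sketch in plain Python of the kind of map, reduce, and join steps a Hadoop job might perform. It is not the sample coursework itself. The region codes, the readings, the admission rates, and every function name are hypothetical placeholders invented for illustration; real datasets from kaggle.com or data.gov.uk will have different columns and will need cleaning first.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean

# Hypothetical placeholder records, NOT real measurements: (region, PM2.5 reading)
pollution_rows = [("A", 10.0), ("A", 14.0), ("B", 30.0), ("B", 34.0), ("C", 20.0)]
# Hypothetical placeholder records: (region, respiratory admissions per 1,000 people)
admission_rows = [("A", 2.0), ("B", 6.0), ("C", 4.0)]


def map_phase(rows):
    """Emit (key, value) pairs, mimicking what a mapper would write out."""
    for region, value in rows:
        yield region, value


def reduce_phase(pairs):
    """Group values by key and average them, mimicking a reducer."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return {key: mean(values) for key, values in grouped.items()}


def join_by_region(left, right):
    """Inner join of two per-region dictionaries on the region key."""
    return {key: (left[key], right[key]) for key in left if key in right}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(sxx * syy)


avg_pollution = reduce_phase(map_phase(pollution_rows))
avg_admissions = reduce_phase(map_phase(admission_rows))
joined = join_by_region(avg_pollution, avg_admissions)

xs = [pair[0] for pair in joined.values()]
ys = [pair[1] for pair in joined.values()]
r = pearson(xs, ys)
print("Per-region averages:", joined)
print("Correlation between pollution and admissions:", round(r, 3))
```

On a real cluster the same logic would more likely be expressed as a Hive or Pig query, or as a MapReduce or Spark job reading from HDFS; the in-memory lists here only stand in for those inputs. A correlation computed this way says nothing about causation, so any write-up should treat it as a first descriptive result.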