Advanced Tasks
Finally, let’s consider more intricate questions about our data set.
Tasks
- Derive daily statistics from the hourly ones.
- The columns Temperature, Dew Point, Wind Speed and Pressure should contain the mean value over the day’s 24 hours.
- The columns for 1 hour and 6 hour Precipitation should hold the sum over the day’s 24 hours.
- The columns for 1 hour and 6 hour Trace Precipitation should be True only if any of the day’s 24 hours already contained a True entry.
- Create a new DataFrame to store the daily data. It should contain only the columns mentioned above and use the day as index.
- Hint: The DataFrame.groupby(…) method could be of great help here.
- Hint: Our index is built from Timestamp objects, which have a very handy floor(…) method. It can be used to reduce an hourly timestamp to its day while remaining compatible with timestamps produced by pandas.date_range(…).
- (Optional) Come up with a metric for a “nice day” and an “awful day”. What were the nicest and the most awful days in your location?
- (Optional, Higher difficulty!) Find out what the “average” wind direction in your location is.
- Note that wind directions can be rather fuzzy values, so you might want to come up with a better metric here than simply calculating the most common value.
- Wind directions are given in increments of 10°.
- The wind direction wraps around from 350 to 0 after the adaptations we made when cleaning the data.
- The approach needed here is called a circular mean, which is somewhat tricky to understand and compute. But maybe someone knows a good approach…
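One possible way to compute the circular mean mentioned above is sketched here. Each direction is treated as a point on the unit circle, the sine and cosine components are averaged separately, and the averaged vector is converted back into an angle. The function name circular_mean_degrees is hypothetical and not part of the course material, and the sketch assumes the directions are given in degrees.

```python
import numpy as np


def circular_mean_degrees(directions):
    """Return the circular mean of angles given in degrees, in [0, 360).

    Hypothetical helper for illustration. NaN entries are ignored.
    If the directions cancel out exactly (e.g. 0 and 180), the mean
    direction is not meaningful and the result is arbitrary.
    """
    radians = np.deg2rad(np.asarray(directions, dtype=float))
    # Average the unit vectors component-wise instead of the raw angles.
    mean_sin = np.nanmean(np.sin(radians))
    mean_cos = np.nanmean(np.cos(radians))
    # Convert the averaged vector back to an angle and wrap it to [0, 360).
    return float(np.rad2deg(np.arctan2(mean_sin, mean_cos)) % 360.0)


# The arithmetic mean of 350 and 10 is 180, which points the wrong way.
# The circular mean lands at north instead (up to floating point rounding,
# so the value may be a tiny amount above 0 or just below 360).
north = circular_mean_degrees([350, 10])
east = circular_mean_degrees([80, 100])
```

Because the result is a continuous angle, you may want to round it to the nearest 10° afterwards to match the resolution of the data.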
Hints for Solving the Task
If you are seriously stuck, you can take a look at the solution hints.
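As a starting point for the daily aggregation, the two hints about groupby(…) and floor(…) can be combined roughly as follows. This is a minimal sketch on synthetic data, not the official solution: the column names and the variable names hourly and daily are assumptions and need to be adapted to your cleaned DataFrame, and only three of the columns are shown.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the cleaned hourly data: two days, one row per hour.
# Column names are assumptions; use the names from your own DataFrame.
index = pd.date_range("2020-01-01", periods=48, freq=pd.Timedelta(hours=1))
hourly = pd.DataFrame(
    {
        "Temperature": np.linspace(0.0, 47.0, 48),
        "1 hour Precipitation": np.ones(48),
        "1 hour Trace Precipitation": [False] * 47 + [True],
    },
    index=index,
)

# Reduce every hourly timestamp to midnight of its day.
day = hourly.index.floor("D")

# Group the rows by day and choose one aggregation per column:
# mean for measurements, sum for precipitation, any for the boolean flags.
daily = hourly.groupby(day).agg(
    {
        "Temperature": "mean",
        "1 hour Precipitation": "sum",
        "1 hour Trace Precipitation": "any",
    }
)

# The floored index holds the same values as a daily date_range.
expected_days = pd.date_range("2020-01-01", periods=2, freq="D")
```

Passing a dictionary to agg(…) also restricts the result to the listed columns, which matches the requirement that the daily DataFrame contains only the columns mentioned in the task.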