Squirrels in Central Park¶

This example uses data from the 2018 Central Park Squirrel Census. The Squirrel Census is a storytelling project focused on the Eastern gray squirrel: the project counts squirrels and presents its findings.

The table contains 3,023 sightings. For each one it records the location (longitude, latitude and hectare), the timing (the date and whether it was the morning or late-afternoon shift), and attributes such as age, fur color and current activity. Hope you like stories about our furry friends :)

Source for data description: https://github.com/rfordatascience/tidytuesday/tree/master/data/2019/2019-10-29

Source for data: https://data.cityofnewyork.us/Environment/2018-Central-Park-Squirrel-Census-Squirrel-Data/vfnx-vebw

squirrel-958377_1280.jpg

Exploratory analysis¶

Missing values¶

Before we start going through the data, it is important to look at the number of missing values in the columns.

The columns Highlight Fur Color, Color Notes, Specific Location, Other Activities and Other Interactions each have a lot of missing values (more than 1,000), so we will drop them. We will also drop Lat/Long, since it repeats the coordinates already stored in columns X and Y.

The remaining columns have only a small number of missing values (fewer than 200 each), so we will retain them.

It's also important to check the data types of the columns to confirm that each one has the type we expect. This also helps reveal errors in data entry.
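The missing-value check and the column drop described above could be sketched as follows. This is only a minimal illustration: the frame `df` holds a few made-up rows rather than the real census table, and the `threshold` variable is a hypothetical name for the cutoff (1,000 on the full table, scaled down here to suit the toy rows).

```python
import pandas as pd

# Toy stand-in for the census table (made-up rows, NOT real data).
df = pd.DataFrame({
    "X": [-73.96, -73.97, -73.95, -73.96],
    "Age": ["Adult", None, "Juvenile", "Adult"],
    "Color Notes": [None, None, None, "white tail tip"],
})

# Count missing values per column, largest first.
missing = df.isna().sum().sort_values(ascending=False)

# Hypothetical cutoff: drop any column with more missing values than this.
# The text above uses 1000 on the full table; 2 suits the toy frame.
threshold = 2
to_drop = missing[missing > threshold].index.tolist()

# Columns below the cutoff are kept, missing values and all.
cleaned = df.drop(columns=to_drop)
print(missing)
print(cleaned.columns.tolist())
```

Driving the drop from a threshold, rather than a hand-typed list of column names, keeps the rule and the result in sync if the data is refreshed.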

X                                             float64
Y                                             float64
Unique Squirrel ID                             object
Hectare                                        object
Shift                                          object
Date                                            int64
Hectare Squirrel Number                         int64
Age                                            object
Primary Fur Color                              object
Combination of Primary and Highlight Color     object
Location                                       object
Above Ground Sighter Measurement               object
Running                                          bool
Chasing                                          bool
Climbing                                         bool
Eating                                           bool
Foraging                                         bool
Kuks                                             bool
Quaas                                            bool
Moans                                            bool
Tail flags                                       bool
Tail twitches                                    bool
Approaches                                       bool
Indifferent                                      bool
Runs from                                        bool
dtype: object
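One thing the listing shows is that Date is stored as int64 rather than as a date. A possible way to convert it is sketched below. It assumes the integers encode the date as MMDDYYYY (for example `10142018`), which should be confirmed against the data description before relying on it. The rows are made up for illustration.

```python
import pandas as pd

# Made-up rows; assumed to follow an MMDDYYYY integer encoding.
df = pd.DataFrame({"Date": [10142018, 10062018, 10202018]})

# Inspect the types first, as in the listing above.
print(df.dtypes)

# Assumption: MMDDYYYY. Zero-pad to 8 digits in case a leading zero was
# lost when the value was read as an integer, then parse explicitly.
df["Date"] = pd.to_datetime(
    df["Date"].astype(str).str.zfill(8), format="%m%d%Y"
)
print(df["Date"])
```

Passing an explicit `format` makes the parse fail loudly on values that do not fit the assumed layout, instead of guessing.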

It is also useful to look at the number of unique entries in each column, to help decide how to use each one.

X                                             3023
Y                                             3023
Unique Squirrel ID                            3018
Hectare                                        339
Shift                                            2
Date                                            11
Hectare Squirrel Number                         23
Age                                              3
Primary Fur Color                                3
Combination of Primary and Highlight Color      22
Location                                         2
Above Ground Sighter Measurement                41
Running                                          2
Chasing                                          2
Climbing                                         2
Eating                                           2
Foraging                                         2
Kuks                                             2
Quaas                                            2
Moans                                            2
Tail flags                                       2
Tail twitches                                    2
Approaches                                       2
Indifferent                                      2
Runs from                                        2
dtype: int64
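The counts above show 3,018 unique values of Unique Squirrel ID against 3,023 rows, so a few IDs appear more than once. A small sketch of how the unique counts and the repeated IDs could be obtained is given here. The rows and ID strings are made up for illustration.

```python
import pandas as pd

# Made-up rows (NOT real data); the ID strings are illustrative only.
df = pd.DataFrame({
    "Unique Squirrel ID": ["id-001", "id-002", "id-001"],
    "Shift": ["PM", "AM", "PM"],
})

# Number of distinct values per column.
unique_counts = df.nunique()

# Rows whose ID occurs more than once (keep=False marks every copy).
dupes = df[df.duplicated("Unique Squirrel ID", keep=False)]
print(unique_counts)
print(dupes)
```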

Squirrel Color and Time of day¶

To start, we will look at the distribution of squirrel sightings across the two times of day (Shift), morning (AM) or afternoon (PM), and across the three primary fur colors: Black, Cinnamon and Gray. Gray squirrels were sighted most often, followed by cinnamon and then black. There was no large difference between the number of morning and afternoon sightings. Guess gray isn't as gloomy as rain clouds make it seem.

image-2.png
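The counts behind a chart like this one could be tabulated with a cross-tabulation of fur color against shift. The sketch below uses made-up rows, so its numbers say nothing about the real census.

```python
import pandas as pd

# Made-up rows (NOT real data).
df = pd.DataFrame({
    "Shift": ["AM", "PM", "AM", "PM", "PM", "AM"],
    "Primary Fur Color": ["Gray", "Gray", "Cinnamon", "Gray", "Black", None],
})

# Rows = fur color, columns = shift. Sightings with a missing
# color are left out by crosstab's default behaviour.
counts = pd.crosstab(df["Primary Fur Color"], df["Shift"])
print(counts)

# With matplotlib installed, counts.plot(kind="bar") would draw
# one group of bars per color.
```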

Squirrel Color variations¶

Another neat thing the dataset records is the variation in squirrel fur color. In this section we look at the different color variations for the three main colors: Black, Cinnamon and Gray.
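One way to list the variations under each main color is to group by Primary Fur Color and count the values of Combination of Primary and Highlight Color. The rows below are made up, and the `Primary+Highlight` string layout is an assumption about how the combination column is written.

```python
import pandas as pd

primary = "Primary Fur Color"
combo = "Combination of Primary and Highlight Color"

# Made-up rows (NOT real data); the "Primary+Highlight" layout is assumed.
df = pd.DataFrame({
    primary: ["Gray", "Gray", "Gray", "Cinnamon", "Black"],
    combo: ["Gray+", "Gray+White", "Gray+White", "Cinnamon+Gray", "Black+"],
})

# Count each combination within its primary color.
variations = df.groupby(primary)[combo].value_counts()
print(variations)
```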

Squirrel vocal sounds¶

Squirrels can make several different vocal sounds; the dataset records three of them: kuks, quaas and moans. Kuks were the most common sound noted, and very few squirrels were heard moaning.

Source for sounds: Squirrel Alarm Calls Are Surprisingly Complex
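Because Kuks, Quaas and Moans are boolean columns, summing them gives the number of sightings in which each sound was heard. A minimal sketch with made-up rows:

```python
import pandas as pd

# Made-up rows (NOT real data).
df = pd.DataFrame({
    "Kuks":  [True, False, True, False],
    "Quaas": [False, False, True, False],
    "Moans": [False, False, False, False],
})

# True counts as 1, so the column sum is the number of sightings
# in which that sound was noted.
sound_counts = df[["Kuks", "Quaas", "Moans"]].sum().sort_values(ascending=False)
print(sound_counts)
```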

Squirrel activities¶

In terms of activities, the squirrels were seen running, climbing and chasing. Running and climbing made up the majority of the activities noted, with chasing coming in a distant third.
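The column listing earlier also includes Eating and Foraging, so a count over all five activity columns might look like the sketch below. The rows are made up and do not reflect the real proportions.

```python
import pandas as pd

activity_cols = ["Running", "Chasing", "Climbing", "Eating", "Foraging"]

# Made-up rows (NOT real data): one row per sighting.
df = pd.DataFrame(
    [
        [True,  False, False, False, False],
        [True,  True,  False, False, False],
        [False, False, True,  False, True],
        [False, False, False, False, False],
    ],
    columns=activity_cols,
)

# Sightings per activity; one sighting can count towards several.
activity_counts = df[activity_cols].sum().sort_values(ascending=False)

# Sightings with at least one of the listed activities.
any_activity = int(df[activity_cols].any(axis=1).sum())
print(activity_counts)
print(any_activity)
```

Since a single sighting can tick several activity columns, the per-activity counts may add up to more than the number of sightings.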

Squirrel reactions to humans¶

One neat thing to find out is how friendly the squirrels in Central Park are. To do so, one can use the columns Indifferent, Approaches and Runs from, which document whether the squirrel was indifferent to human presence, approached the humans or ran away. In most interactions the squirrels were indifferent to human presence. Some squirrels ran away, whereas a small number approached humans when they saw them.

image.png
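To compare the three reactions on a common scale, the boolean columns can be averaged to give the share of sightings showing each reaction. The sketch below uses made-up rows and also counts sightings where none of the three reactions was recorded.

```python
import pandas as pd

reaction_cols = ["Approaches", "Indifferent", "Runs from"]

# Made-up rows (NOT real data).
df = pd.DataFrame({
    "Approaches":  [False, False, False, False],
    "Indifferent": [True,  True,  False, False],
    "Runs from":   [False, False, True,  False],
})

# The mean of a boolean column is the fraction of rows that are True.
shares = df[reaction_cols].mean()

# Sightings where no reaction at all was recorded.
no_reaction = int((~df[reaction_cols].any(axis=1)).sum())
print(shares)
print(no_reaction)
```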

Squirrel sightings by hectare¶

To visualize the location of the sightings, the dataset divides the total area of Central Park into a grid of hectares. We visualize the number of squirrels found in each cell of the hectare grid.

image.png
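A grid of counts like the one plotted here could be built by splitting each hectare label into a row and a column. The sketch assumes that a label is a two-digit number followed by a letter from A to I (for example `01A`); that layout is an assumption to check against the data description, and the rows are made up.

```python
import pandas as pd

# Made-up rows (NOT real data); labels assumed to be "<2 digits><A-I>".
df = pd.DataFrame({"Hectare": ["01A", "01A", "01B", "02A", "42I"]})

# Sightings per hectare label.
per_hectare = df["Hectare"].value_counts()

# Split the label into a numeric row and a letter column (assumed layout).
parts = df["Hectare"].str.extract(r"^(?P<row>\d{2})(?P<col>[A-I])$")

# Rows = hectare number, columns = hectare letter, values = sightings.
grid = pd.crosstab(parts["row"].astype(int), parts["col"])
print(grid)
```

Only hectares that appear in the data show up in `grid`; reindexing it to the full range of rows and letters would add the empty cells before plotting a heatmap.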

Another interesting question is which fur color dominates each hectare; this is explored here.

image.png
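One simple reading of "dominant" is the fur color with the most sightings in a hectare. Under that assumption, the dominant color per hectare could be found as sketched below, again with made-up rows.

```python
import pandas as pd

# Made-up rows (NOT real data).
df = pd.DataFrame({
    "Hectare": ["01A", "01A", "01A", "01B", "01B"],
    "Primary Fur Color": ["Gray", "Gray", "Black", "Cinnamon", None],
})

# Sightings per hectare and color (missing colors are left out).
color_counts = pd.crosstab(df["Hectare"], df["Primary Fur Color"])

# Most frequent color in each hectare. On a tie, idxmax returns the
# first column, i.e. the alphabetically first color.
dominant = color_counts.idxmax(axis=1)
print(dominant)
```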

This ends the furry analysis of Central Park squirrels. I will return with another interesting dataset to look into :)