Lesson 9. Working with activity data: what the animal was doing there?
Once acceleration data are stored into the database, they are ready to be analyzed. Acceleration data can provide valuable information and can help to create the so called semantic trajectories, i.e. description of the movement that identify specific actions of animals while they interact with the surrounding environment. In this lesson you will see some example of exploratory analysis.
Topic 1. Relate activity and GPS data
You can now join activity and GPS data. In our activity data set, activity data are recorded with a frequency of five minutes, thus much more often than GPS positions (with a frequency of four hours). Hence, you need to combine the two data sets in a way that makes sense for the specific scientific question to be asked.
The most simple approach to relate activity and GPS data is to sub-sample the activity data to the same temporal resolution as the GPS positions. This can be done by a simple JOIN between the GPS data and the activity data, where you combine the data from the two sensors on the same animal at the same time. The result is that you associate the activity data record (what the animal is doing) to each GPS position (where the animal is doing it). You look for the activity data that is in an interval of 5 minutes (2 minutes and 30 seconds after and before) from the GPS position acquisition time. The option LIMIT 5 limits the result to the first 5 records. You can see that with large amounts of data, the processing time can increase considerably:
SELECT gps_data_animals.animals_id AS id, gps_data_animals.acquisition_time AS acquisition_time_gps, activity_data_animals.acquisition_time::time AS time_act, ST_X(gps_data_animals.geom)::numeric(7,5) AS gps_x, ST_Y(gps_data_animals.geom)::numeric(7,5) AS gps_y, activity_data_animals.activity_x AS act_x, activity_data_animals.activity_y AS act_y FROM main.gps_data_animals INNER JOIN main.activity_data_animals ON gps_data_animals.animals_id = activity_data_animals.animals_id AND ((gps_data_animals.acquisition_time - activity_data_animals.acquisition_time) < interval '150 second') AND ((gps_data_animals.acquisition_time - activity_data_animals.acquisition_time) > interval '-150 second') AND gps_validity_code = 1 LIMIT 5
The result is
In the next query, you retrieve the average and standard deviation value for activity in the different land cover types. In this way, you can see whether the data indicate an obvious effect of the environment on the activity of the animal:
SELECT label3 AS land_cover, avg(activity_data_animals.activity_x)::numeric(5,2) AS avg_x, stddev(activity_data_animals.activity_x)::numeric(5,2) AS stddev_x, avg(activity_data_animals.activity_y)::numeric(5,2) AS avg_y, stddev(activity_data_animals.activity_y)::numeric(5,2) AS stddev_y, count(label3) AS num FROM main.gps_data_animals, main.activity_data_animals, env_data.corine_land_cover_legend WHERE gps_data_animals.animals_id = activity_data_animals.animals_id AND ((gps_data_animals.acquisition_time - activity_data_animals.acquisition_time) < interval '150 second') AND ((gps_data_animals.acquisition_time - activity_data_animals.acquisition_time) > interval '-150 second') AND gps_validity_code = 1 AND gps_data_animals.corine_land_cover_code = corine_land_cover_legend.grid_code GROUP BY label3;
The result reported is
|Sparsely vegetated areas||13.82||12.44||29.44||23.44||101|
As you can see, the highest activity rate is recorded in pasture lands, which might lead to interesting considerations on the behaviour of the animal.
In the last example, you retrieve the average activity value of the 6 activity records that are closest in time to each GPS position, which means 15 minutes after and before.
SELECT gps_data_animals.gps_data_animals_id AS gps_id, gps_data_animals.animals_id AS animal, gps_data_animals.acquisition_time, avg(activity_data_animals.activity_x)::numeric(5,2) AS avg_act_x, avg(activity_data_animals.activity_y)::numeric(5,2) AS avg_act_y FROM main.gps_data_animals INNER JOIN main.activity_data_animals ON gps_data_animals.animals_id = activity_data_animals.animals_id AND ((gps_data_animals.acquisition_time - activity_data_animals.acquisition_time) < interval '15 minute') AND ((gps_data_animals.acquisition_time - activity_data_animals.acquisition_time) > interval '-15 minute') AND gps_validity_code = 1 GROUP BY gps_data_animals.gps_data_animals_id, gps_data_animals.animals_id, gps_data_animals.acquisition_time ORDER BY gps_data_animals.acquisition_time LIMIT 5;
The result is
- Retrieve the centroid of locations for animal 3 for each month together with the average monthly value of act_x and act_y
Topic 2. Identification of animal's activities from activity data
The main goal of activity sensors id to provide data that can help to decipher what an animal was doing. Transforming activity data into animal's activity is a complex process and it is still a field of research. In this lesson we explore simple approaches.
Assuming that the value '0' correspond to a very low activity (e.g. resting), you can calculate the percentage of records with that value per each hour of the day (regardless of the animal) during June. You can use the CASE SQL command to transform the activity data into 1 if the original value is 0 and into 0 if the original value was greater than 0. Then you can count the total number of records and make a sum to know the percentage of 'inactive' records.
SELECT animals_id, EXTRACT(HOUR FROM acquisition_time) AS hour_day, sum(CASE WHEN activity_x = 0 THEN 1 ELSE 0 END)*100/count(*) AS perc_inactivity_x, sum(CASE WHEN activity_y = 0 THEN 1 ELSE 0 END)*100/count(*) AS perc_inactivity_y FROM main.activity_data_animals WHERE EXTRACT(MONTH FROM acquisition_time) = 6 GROUP BY animals_id, EXTRACT(HOUR FROM acquisition_time) ORDER BY EXTRACT(HOUR FROM acquisition_time);
The result is
- What is the percentage of inactive records (i.e. activity = 0) in December? Is any clear season pattern (compare with percentage of inactive records in June)?
Summary exercise of Lesson 9
- Is there any daily pattern in the average activity value (compare percentage of inactive records per hour of the day)?
- Associate GPS location with activity data that are inside a time interval of 2 hours after and before the GPS acquisition time and calculate the percentage of inactive activity record per GPS location. How many locations have a percentage of inactivity greater than 75%