A data science project starts by identifying data sources. These may include web server logs, social media posts, digital libraries and databases, web APIs, web scraping, or data that already sits in an Excel spreadsheet.

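Next comes exploratory data analysis (EDA), followed by later stages such as machine learning (ML) model building, natural language processing (NLP), computer vision (CV), and predictive analytics.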

Levels

Practice here

Data CTF

Sharpen your fundamentals with hands-on, rewarding data exercises.

Challenge 19

You're given NASA's meteorite landings dataset (meteorites.csv) with intentional anomalies. Your task is to:

  1. Clean the data
  2. Find hidden patterns
  3. Locate the coordinates of an ancient meteorite impact site


```csv
name,mass (g),year,reclat,reclong
"Allegheny", 453,"1999-01-01",34.05,-118.24
"Chinguetti", 1220,"2005",-invalid-,16.88
"Tatahouine", NaN,"1933-06-27",33.79,"10:85"
"Brahin","abc","1998",52.58,-113.49
"Canyon Diablo", 1000,,35.03,-111.02
```
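As a starting point for the cleaning step, here is a minimal sketch that uses only the Python standard library and the five sample rows shown above. The helper names (`to_float`, `to_year`, `clean_meteorites`) and the validity ranges are assumptions made for illustration, not part of the challenge. Unparseable values such as `-invalid-`, `abc`, `NaN`, and `10:85` are marked as missing (`None`) rather than guessed at.

```python
import csv
import io
import math
import re

# The sample rows from the challenge, embedded so the sketch is self-contained.
SAMPLE = '''name,mass (g),year,reclat,reclong
"Allegheny", 453,"1999-01-01",34.05,-118.24
"Chinguetti", 1220,"2005",-invalid-,16.88
"Tatahouine", NaN,"1933-06-27",33.79,"10:85"
"Brahin","abc","1998",52.58,-113.49
"Canyon Diablo", 1000,,35.03,-111.02
'''


def to_float(text, lo=None, hi=None):
    """Return text as a finite float within [lo, hi], or None if unusable."""
    try:
        value = float(text.strip())
    except (ValueError, AttributeError):
        return None
    if math.isnan(value) or math.isinf(value):
        return None
    if lo is not None and value < lo:
        return None
    if hi is not None and value > hi:
        return None
    return value


def to_year(text):
    """Extract a leading four-digit year from '2005' or '1999-01-01'."""
    match = re.match(r"\s*(\d{4})", text or "")
    return int(match.group(1)) if match else None


def clean_meteorites(source):
    """Parse the CSV and normalise each column; bad values become None."""
    rows = []
    # skipinitialspace drops the blank that follows some commas in the sample.
    for raw in csv.DictReader(source, skipinitialspace=True):
        rows.append({
            "name": raw["name"].strip(),
            "mass_g": to_float(raw["mass (g)"], lo=0),            # assumed: mass >= 0
            "year": to_year(raw["year"]),
            "reclat": to_float(raw["reclat"], lo=-90, hi=90),     # valid latitude range
            "reclong": to_float(raw["reclong"], lo=-180, hi=180), # valid longitude range
        })
    return rows


rows = clean_meteorites(io.StringIO(SAMPLE))
for row in rows:
    print(row)
```

Keeping bad values as `None` instead of dropping whole rows leaves the decision about imputation or removal to the later pattern-finding steps.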

Challenge 20

A smart factory's temperature sensors suddenly reported erratic readings. Analyze 30 days of temporal data from sensor_readings.csv to:

  1. Identify corrupted sensors
  2. Find the exact timestamp of the first anomaly
  3. Determine which sensor was compromised first

```csv
timestamp,sensor_id,temp_C
"2023-07-01 00:00", S01,25.3
"July 1 2023 00:15", S02,"N/A"
2023-07-01 00:30,S03,32.1
2023-07-01 00:30,S03,32.1 # Duplicate
"2023-07-01 00:45", S04,-273
...
```
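One possible way to approach the three tasks is sketched below with the standard library only, run against the visible sample rows (the elided `...` rows are left out). Everything named here is an assumption for illustration: the two timestamp formats, the plausible range of -50 to 150 °C, and the choice to count a missing reading such as `N/A` as an anomaly. Note that `%B` parses English month names only under an English or C locale.

```python
import csv
import io
from datetime import datetime

# Visible sample rows from the challenge; the elided rows are not reproduced.
SAMPLE = '''timestamp,sensor_id,temp_C
"2023-07-01 00:00", S01,25.3
"July 1 2023 00:15", S02,"N/A"
2023-07-01 00:30,S03,32.1
2023-07-01 00:30,S03,32.1 # Duplicate
"2023-07-01 00:45", S04,-273
'''

# Assumed: only these two timestamp layouts occur in the file.
FORMATS = ("%Y-%m-%d %H:%M", "%B %d %Y %H:%M")
# Assumed plausible range for a factory floor, in degrees Celsius.
PLAUSIBLE_C = (-50.0, 150.0)


def parse_timestamp(text):
    """Try each known format; return None if none matches."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue
    return None


def parse_temp(text):
    """Strip a trailing '# ...' annotation, then parse; None if not numeric."""
    text = text.split("#", 1)[0].strip()
    try:
        return float(text)
    except ValueError:
        return None


def load_readings(source):
    """Return de-duplicated (timestamp, sensor_id, temp) tuples in file order."""
    seen = set()
    readings = []
    for raw in csv.DictReader(source, skipinitialspace=True):
        record = (
            parse_timestamp(raw["timestamp"]),
            raw["sensor_id"].strip(),
            parse_temp(raw["temp_C"]),
        )
        if record in seen:
            continue  # exact duplicate of an earlier row
        seen.add(record)
        readings.append(record)
    return readings


def first_anomalies(readings):
    """Map each sensor to the earliest timestamp of a missing or implausible reading."""
    lo, hi = PLAUSIBLE_C
    first = {}
    for ts, sensor, temp in readings:
        if ts is None:
            continue  # cannot place the reading in time
        bad = temp is None or not (lo <= temp <= hi)
        if bad and (sensor not in first or ts < first[sensor]):
            first[sensor] = ts
    return first


readings = load_readings(io.StringIO(SAMPLE))
anomalies = first_anomalies(readings)
earliest_sensor = min(anomalies, key=anomalies.get) if anomalies else None
print(anomalies)
print(earliest_sensor)
```

A fixed range check is the simplest possible detector. On the full 30 days of data, a per-sensor rolling statistic would likely catch subtler drift, but that goes beyond what the sample rows can demonstrate.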

Data Collection CTF

Sharpen your fundamentals with hands-on, rewarding data exercises.

Challenge

Not available yet. Coming soon.
