Finally, we need to know what data our system is actually going to store. This could be in a pandas DataFrame, a dictionary, or just whatever format it comes in from our API. We need to know this before we start programming because it tells us which rules the data will need to follow.
Otherwise, our (shall we say) less-than-tech-savvy users will cause error after error after error. Never underestimate an end user: they can and will break the rules!
So, let's be specific about the data types and validation rules so we know how to stop bad data before it gets stored.
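To make "data types and validation" concrete, here is a minimal sketch of what such rules can look like in Python. The field names (`name`, `weight_g`), the rules attached to them, and the `validate_rock` function are all made up for illustration; your own rules should come from your data dictionary.

```python
def validate_rock(record):
    """Return a list of error messages; an empty list means the record is valid."""
    errors = []

    # Hypothetical rule: name must be a non-empty string.
    name = record.get("name")
    if not isinstance(name, str) or not name.strip():
        errors.append("name must be a non-empty string")

    # Hypothetical rule: weight_g must be a positive number.
    # bool is rejected explicitly because Python treats True/False as integers.
    weight = record.get("weight_g")
    if isinstance(weight, bool) or not isinstance(weight, (int, float)):
        errors.append("weight_g must be a number")
    elif weight <= 0:
        errors.append("weight_g must be greater than 0")

    return errors


print(validate_rock({"name": "Granite", "weight_g": 152.5}))
# → []
print(validate_rock({"name": "", "weight_g": "heavy"}))
# → ['name must be a non-empty string', 'weight_g must be a number']
```

Returning a list of messages, rather than stopping at the first problem, is one possible design choice: it lets the program show the user everything that needs fixing in one go.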
Example
Here's a data dictionary for storing rock data in the Rock Collection Program: