Data Dictionary

Finally, we need to know what data our system is actually going to store. It might live in a pandas DataFrame, a dictionary, or whatever format it arrives in from your API. We need to pin this down before we start programming so that we know which rules the data has to follow.

Otherwise your, shall we say, less-than-tech-savvy users will cause error after error after error. Never underestimate an end user: they can and will break the rules!

So, let's be specific about the data types and validation rules so we know how to stop this.

Example

Here's a data dictionary for storing rock data in the Rock Collection Program:

| Variable | Data Type | Format for Display | Size in Bytes | Size for Display | Description | Example | Validation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| name | String | Text | 50 | 50 | The name of the rock/mineral | "Granite" | Must be a non-empty string |
| hardness | Integer | Whole number | 4 | 2 | Hardness scale value | 6 | 1 ≤ hardness ≤ 10 |
| composition | String | Text | 100 | 100 | Main mineral components | "Quartz, Feldspar" | Must be a valid string |
| colour | String | Text | 20 | 20 | Common rock colours | "Grey" | Must be a valid colour name |
| location_found | String | Text | 20 | 100 | Place where the rock was found | "Yosemite, USA" | Must be a valid location |
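
To see how the Validation column turns into code, here is a minimal sketch in Python. It assumes each rock is stored as a plain dictionary keyed by the variable names above. The function name `validate_rock`, the `VALID_COLOURS` list, and the choice to treat "valid location" as "any non-empty string" are all assumptions made for illustration; they are not part of the Rock Collection Program itself. The length checks also assume one byte per character.

```python
# Hypothetical sample of allowed colours; the data dictionary only says
# "must be a valid colour name", so this list is an assumption.
VALID_COLOURS = {"grey", "black", "white", "red", "pink", "brown", "green"}


def validate_rock(rock):
    """Return a list of error messages; an empty list means the record is valid."""
    errors = []

    # name: non-empty string, at most 50 characters
    name = rock.get("name")
    if not isinstance(name, str) or not name.strip():
        errors.append("name must be a non-empty string")
    elif len(name) > 50:
        errors.append("name must be at most 50 characters")

    # hardness: whole number between 1 and 10 inclusive
    hardness = rock.get("hardness")
    # bool is a subclass of int in Python, so reject it explicitly
    if not isinstance(hardness, int) or isinstance(hardness, bool):
        errors.append("hardness must be a whole number")
    elif not 1 <= hardness <= 10:
        errors.append("hardness must be between 1 and 10")

    # composition: string, at most 100 characters
    composition = rock.get("composition")
    if not isinstance(composition, str):
        errors.append("composition must be a string")
    elif len(composition) > 100:
        errors.append("composition must be at most 100 characters")

    # colour: must match one of the assumed colour names (case-insensitive)
    colour = rock.get("colour")
    if not isinstance(colour, str) or colour.lower() not in VALID_COLOURS:
        errors.append("colour must be a valid colour name")

    # location_found: assumed here to mean any non-empty string
    location = rock.get("location_found")
    if not isinstance(location, str) or not location.strip():
        errors.append("location_found must be a non-empty string")

    return errors


granite = {
    "name": "Granite",
    "hardness": 6,
    "composition": "Quartz, Feldspar",
    "colour": "Grey",
    "location_found": "Yosemite, USA",
}
print(validate_rock(granite))  # []

# A record that breaks two rules
bad_rock = dict(granite, name="", hardness=11)
print(validate_rock(bad_rock))
# ['name must be a non-empty string', 'hardness must be between 1 and 10']
```

Returning a list of messages, rather than stopping at the first problem, lets the program tell the user everything that is wrong with a record in one go.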

Need More Information?

Data Dictionaries
