70 Database Design from Spreadsheets
Spreadsheet usage in extensive in the businesses due to their versatility, accessibility, and ease of use. Many businesses collect and store data in spreadsheets initially.
See recent example of formula 1 excel usage. The formula 1 team used excel for their 20000 part inventory, like below image shows their system become very complex.
show me all excel files you are using since plants (factories) fill the gaps with the excel since their system cannot produce these reports. these reports show the gaps.
From my experience working with implementing ERP systems, reproducing spreadsheets in the database is very important. I also always ask if they use any excel files.
But problems with the spreadsheets are following:
- Starting with spreadsheets are easy but they may become very complex.
- Collaboration is easy with online versions of spreadsheets but role based security is not possible
Many businesses later decide to to migrate and manage this spreadsheet data in a relational database. Thus, transforming spreadsheets to databases is valuable across various roles, including requirement engineer, data analysis, database administration, and software development.
Transforming spreadsheets to relational database offers following benefits:
- Spreadsheets are often unstructured. Transforming them to normalized tables impose logical structure, making data analysis more easier.
- Database tables enforce integrity with their data types and constraints. This reduces error in data entry and calculations
- Data retrieval performance and scalability for multi users are always better. Especially for large data, relational databases are always better.
- Data Security and access control is better with databases
- SQL language offer very complex queries for reporting purposes
- Applications for mobile, web and desktop are more easier
- Dynamic relationships are more easier to handle. It is harder to connect multiple spreadsheets but it is more easier to connect multiple tables even multiple databases.
- Having one single source of truth is possible with databases while it is very hard to do so in spreadsheets. It is very common to have multiple versions of spreadsheets.
70.1 Examples
70.1.1 Student Advisors
STU_NUM | STU_NAME | STU_MAJOR | DEPT_CODE | DEPT_NAME | DEPT_PHONE | COLLEGE_NAME | ADVISOR_NAME | ADVISOR_OFFICE | ADVISOR_BLDG | ADVISOR_PHONE | STU_GPA | STU_CLASS |
---|---|---|---|---|---|---|---|---|---|---|---|---|
211343 | Ahmet | Computer Enginering | CENG | Computer Enginering | 4356 | Enginering | Ender | T201 | T Building | 2115 | 3.87 | Junior |
200128 | Smith | Computer Eng | CENG | Computer Enginering | 4356 | Enginering | Ender | T201 | T Building | 2115 | 2.78 | Sophomore |
199876 | Ayşe | Marketing | MKT | Marketing | 4378 | Business Admin | Atilla | T228 | T Building | 2123 | 2.31 | Senior |
198648 | Asya | Marketing | MKT | Marketing | 4378 | Business Admin | Fatih | T356 | T Building | 2159 | 3.45 | Senior |
223456 | Seda | Statistics | MATH | Mathematics | 3420 | Sciences | Hakan | J331 | A Building | 3209 | 3.58 | Junior |
70.1.2 Office Equipment
A company is tracking its office equipment like
- furniture
- computers
- printers
- and other office equipment
Some sample values are given below.
ITEM_ID | ITEM_LABEL | ROOM_NUMBER | BUILDING_CODE | BUILDING_NAME | BUILDING_MANAGER |
---|---|---|---|---|---|
1 | Desk | 301 | BLDG1 | Main Building | John Smith |
2 | Chair | 301 | BLDG1 | Main Building | John Smith |
3 | HP DeskJet 895Cse | 205 | BLDG2 | Annex | Jane Doe |
4 | Whiteboard | 205 | BLDG2 | Annex | Jane Doe |
5 | HP Pavilion Laptop | 102 | BLDG1 | Main Building | John Smith |
Company would like to answer following questions in their reports
- How many laptops are assigned to employees?
- Do we have more than one printer in the office rooms?
- Do we have at least one printer in every floor?
Please design a new database for their use cases at least in 3rd normal form.
70.1.3 Student Classes
Student_ID | Name | Class information | Course_No | Course name | Grades |
---|---|---|---|---|---|
1 | Ayşe | 12A, Room 123, class teacher: Mesut | 2 | Programming | 80 |
2 | Duru | 11C, Room 115, class teacher: Atilla | 3 | Databases | 70 |
3 | Aydın | 11B, Room 210, class teacher: Ender | 1, 2, 3 | Web Apps, Programming, Databases | 90, 60, 74 |
4 | Ahmet | 12A, Room 123, class teacher: Mesut | 1, 2 | Programming, Web Apps | 67, 50 |
5 | Funda | 11C, Room 115, class teacher: Atilla | 1 | Web Apps | 72 |
70.1.4 Developers and Projects
Personel Number | FirstName | LastName | Title | ProjectNumber | ProjectRole | Hours in Project |
---|---|---|---|---|---|---|
12345 John | Doe | Manager | P001 | Project Lead | 40 | |
54321 | Jane | Smith | Analyst | P001,P002 | Data Analyst,Data Analyst | 35,10 |
67890 | David | Lee | Developer | P001,P003 | Developer,Data Analyst | 30, 60 |
98765 | Sarah | Jones | Tester | P003 | Tester | 32 |
13579 | Michael | Brown | Architect | P002 | Solution Architect | 45 |
24680 | Emily | Davis | Designer | P001 | UX Designer | 38 |
15975 | Daniel | Miller | Analyst | P003 | Business Analyst | 30 |
35791 | Olivia | Wilson | Developer | P002 | Senior Developer | 40 |
57913 | James | Moore | Tester | P001,P002,P003 | QA Lead,QA,QA Lead | 35,10,20 |
79135 | Chloe | Taylor | Architect | P003 | Enterprise Architect | 50 |
70.1.5 Employee information
EMP_NUM | EMP_FIRST_NAME | EMP_LAST_NAME | EMP_EDUCATION | JOB_CLASS | EMP_DEPENDENTS | DEPT_CODE | DEPT_NAME | DEPT_MANAGER | EMP_TITLE | EMP_DOB | EMP_HIRE_DATE | EMP_SKILLS | EMP_SALARY |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1001 | John | Doe | Bachelor’s | Engineer | Jane (spouse),Mary (daughter), Jill (daughter) | IT | Information Technology | Jane Smith | Software Engineer | 1990-01-15 | 2020-05-10 | Java, Python, SQL | 75000 |
1002 | Jane | Smith | Master’s | Manager | HR | Human Resources | David Lee | HR Manager | 1985-08-20 | 2018-11-22 | Leadership, Communication | 90000 | |
1003 | David | Lee | Bachelor’s | Analyst | Finance | Finance | Michael Brown | Financial Analyst | 1992-03-12 | 2021-07-05 | Excel, Tableau | 60000 | |
1004 | Sarah | Jones | PhD | Scientist | John (spouse) | Research | Research & Development | Emily Davis | Research Scientist | 1988-11-05 | 2015-02-18 | Biology, Chemistry | 85000 |
1005 | Michael | Brown | Master’s | Manager | George (son) | Sales | Sales & Marketing | Daniel Miller | Sales Manager | 1980-05-25 | 2012-09-10 | Sales, Negotiation | 100000 |
70.1.6 Consultant
CLIENT_NUM | CLIENT_NAME | CLIENT_CITY | CONTRACT_DATE | CONTRACT_NUMBER | CONTRACT_AMOUNT | CONSULT_CLASS_1 | CONSULT_CLASS_2 | CONSULT_CLASS_3 | CONSULT_CLASS_4 | CONSULTANT_NUM_1 | CONSULTANT_NAME_1 | CONSULTANT_CITY_1 | CONSULTANT_NUM_2 | CONSULTANT_NAME_2 | CONSULTANT_CITY_2 | CONSULTANT_NUM_3 | CONSULTANT_NAME_3 | CONSULTANT_CITY_3 | CONSULTANT_NUM_4 | CONSULTANT_NAME_4 | CONSULTANT_CITY_4 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
C001 | Acme Corp | New York | 2024-07-05 | C12345 | 50000 | Strategy | Marketing | Finance | IT | 101 | John Doe | New York | 102 | Jane Smith | London | 103 | David Lee | Tokyo | 104 | Sarah Jones | Paris |
C002 | Global Tech | London | 2023-11-10 | C23456 | 75000 | Technology | Operations | Legal | HR | 105 | Michael Brown | Berlin | 106 | Emily Davis | Sydney | 107 | Daniel Miller | Rome | 108 | Olivia Wilson | Madrid |
C003 | Innovate Inc | Paris | 2024-03-20 | C34567 | 30000 | Marketing | Sales | Finance | Legal | 109 | James Moore | Toronto | 110 | Chloe Taylor | Dubai | 111 | Alex Lee | Seoul | 112 | Ben Walker | Mumbai |
C004 | Data Solutions | Tokyo | 2022-09-08 | C45678 | 45000 | Data Analysis | AI | Cloud | Cybersecurity | 113 | Lisa Wong | Shanghai | 114 | Mark Davis | Singapore | 115 | Nancy Chen | Beijing | 116 | Peter Jones | Hong Kong |
C005 | Green Energy | Sydney | 2024-05-15 | C56789 | 60000 | Sustainability | Renewable Energy | Energy Storage | Grid Modernization | 117 | Robert Smith | New Delhi | 118 | Susan Lee | Rio de Janeiro | 119 | Thomas Brown | Cape Town | 120 | Victoria Jones | Mexico City |