COMS W4111: Introduction to Databases
Spring 2024, Sections 002/V02
Homework 1
Introduction to Core Concepts, ER Modeling, Relational Algebra, SQL
Introduction
This notebook contains Homework 1. Both Programming and Nonprogramming tracks should complete this homework.
Submission Instructions
You will submit PDF and ZIP files for this assignment. Gradescope will have two separate assignments for these.
For the PDF:
The most reliable way to save as PDF is to go to your browser's menu bar and click File -> Print . Switch the orientation to landscape mode, and hit save.
MAKE SURE ALL YOUR WORK (CODE AND SCREENSHOTS) IS VISIBLE ON THE PDF. YOU WILL NOT GET CREDIT IF ANYTHING IS CUT OFF. Reach out for troubleshooting.
MAKE SURE YOU DON'T SUBMIT A SINGLE PAGE PDF. Your PDF should have multiple pages.
For the ZIP:
Zip a folder containing this notebook and any screenshots.
You may delete any unnecessary files, such as caches.
Add Student Information
In [ ]:
# Print your name, uni, and track below name = "Donald Ferguson" uni = "dff9" track = "Programming Track" print(name) print(uni) print(track) |
Setup
SQL Magic
The sql extension was installed in HW0. Double check that if this cell doesn't work.
In [ ]:
%load_ext sql |
You may need to change the password below.
In [ ]:
%sql mysql+pymysql://root:dbuserdbuser@localhost
In [ ]:
%sql SELECT * FROM db_book.student WHERE ID = 12345
Python Libraries
In [ ]:
from IPython.display import Image import pandas |
Written Questions
Chapter 1 from the recommended textbook Database System Concepts, Seventh Edition (https://codex.cs.yale.edu/avi/db-book/) covers general information and concepts about databases and database management systems. Lecturing on the general and background information is not a good use of precious class time. To be more efficient with class time, the chapter 1 information is a reading assignment.
(https://codex.cs.yale.edu/avi/db-book/slides-dir/index.html) provided by the textbook authors provide the necessary information. In some cases, students may also have to search the web or other sources to “read” the necessary information.
When answering the written questions, do not “bloviate”. The quantity of words does not correlate with the quality of the answer. We will deduct points if you are not succinct. The answers to the questions require less than five sentences or bullet points.
“If you can't explain something in a few words, try fewer.”
You may use external resources, but you should cite your sources.
W1
What is a database management system and how do relational databases organizedata?
W2
Columbia University uses several applications that use databases to run the university. Examples are SSOL and CourseWorks. An alternate approach could be letting students, faculty, administrators, etc. use shared Google Sheets to create, retrieve, update, and delete information. What are some problems with the
shared spreadsheet approach and what functions do DMBS implement to solve the problems?
W3
Explain the differences between SQL, MySQL Server and DataGrip.
W4
Crow’s Foot Notation has four endings for relationship lines. Briefly explain the meaning of each ending.
W5
What is a primary key and why is it important?
W6
The relational algebra is closed under the operators. Explain what this means and give an example.
W7
Some of the Columbia University databases/applications represent the year/semester attribute of a section in the form "2023_2". The first four characters are the academic year, and the last character is the semester (1, 2, or 3). The data type for this attribute might be CHAR(6). Using this example, explain theconcepts of domain and atomic domain. How is domain different from type?
W8
Briefly explain the difference between a database schema and database instance.
W9
Briefly explain the concepts of data definition language and data manipulation language.
W10
What is physical data independence?
Entity-Relationship Modeling
Overview
The ability to understand a general description of a requested data model and to transform into a more precise, specified logical model is one of the most important skills for using databases. SW and data engineers build applications and data models for end-users. The end-users, product managers and business managers are not SW or data modeling experts. They will express their intent in imprecise, text and words.
The users and business stakeholder often can understand and interact using a conceptual model but details like keys, foreign keys, ... are outside their scope.
In this problem, you will:
Understand a short written description of a requested data model.
Produce a conceptual data model diagram using Lucidchart.
Produce a logical data model diagram using Lucidchart.
You can sign up for a free Lucidchart account. (https://www.lucidchart.com/pages/landing) The free account provides the capabilities you will need for this course.
To draw the diagrams, you need to add the entity relationship shapes. Lecture 2 demonstrated how to add the shapes.
Adding Entity Relationship Shapes
We provide a simple Lucidchart document (https://lucid.app/lucidchart/828777b1-7b2d-4828-bedb-37b6d456c33e/edit?invitationId=inv_a142899a-7e60-44e9- b18e-335d7c9767fc) from Lecture 2 that helps you get started. You need a Lucidchart account to access the document and diagrams.
Data Model Description
The data model represents banks, customers. employees and accouts. The model has the following entity types/sets:
1. Customer
2. Employee of the banking company
3. Branch, which is a location of one of the banks offices
4. Savings Account
5. Checking Account
6. Loan
7. Portfolio
Customer has the following properties:
customerID
lastName
firstName
dateOfBirth
Employee has the following properties:
employeeID
lastName
firstName
jobTitle
Branch has the following properties:
branchID
zipCode
Savings Account has the following properties:
accountID
balance
interestRate
Checking Account has the following properties:
accountID
balance
Loan has the following properties.
loanID
balance
interestRate
Portfolio has the following properties:
portfolioID
createdDate
The data model has the following relationships:
Customer Branch connects a customer and a branch. A Customer is connected to exactly one Branch. A Branch may have 0, 1 or many customers.
Employee Branch connects an employee and a branch. An Employee is connected to exactly one Branch. A Branch may have 0, 1 or many associated employees.
Savings Account Branch , Checking Account Branch_, and__Loan Branch all have the same pattern.
An account/loan has exactly one branch.
A Branch many have 0, 1 or many accounts/loans.
Savings Customer , Checking Customer , Loan Customer_, and__Portfolio Customer follow the same pattern.
The account/loan has exactly one customer.
The customer may have 0 or 1 of each type of account.
A Portfolio is related to exactly one Customer_, exactly one _Savings Account_, exactly one _Checking Account_, and exactly one _Loan.
Portfolio Advisor relates a Portfolio and Employee. An Employee may be the advisor for 0, 1 or many Portfolios. A Portfolio may have at most one Employee advisor.