This first week felt like a clean reset. I went through the zyBooks SQL basics and got IntelliJ running again for the Java pieces. I finished Labs 1, 3, 4, and 8, plus Homework 1. I actually enjoyed Homework 1 a lot—I’m excited to get more comfortable with SQL because I’ll use it at work, and since I want to move toward a data scientist role, getting past “Excel-only” and into real databases matters. I still love Excel for quick analysis, but SQL gives me reach and structure.
1. Relational database tables and spreadsheets look similar, since both have rows and columns. What are some important differences between the two?
On the surface they look the same—rows and columns—but they behave very differently. In a spreadsheet I can type almost anything into any cell; it’s flexible, but that also means it’s easy to break things. In a relational database I have a schema: real data types, primary keys, foreign keys, and constraints that stop bad data from getting in. That structure is the point. Databases also handle relationships across tables without copy-pasting—joins beat “VLOOKUP gymnastics.” They scale better, too. If I try to mash 300k rows into a spreadsheet with a bunch of formulas, it crawls. A database can index those columns and answer the same question quickly. Finally, databases are built for multiple people at once and keep data consistent with transactions; a spreadsheet shared by email or even online can still get out of sync or overwritten.
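To make the schema and join points concrete, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The `customers` and `orders` tables and their columns are made up for illustration and do not come from the labs. It shows two bad rows being rejected by constraints, then a join that pulls related data together without copying anything.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical schema: every order must point at a real customer
# and must have a positive amount.
conn.execute("""
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    )
""")
conn.execute("""
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        amount      REAL NOT NULL CHECK (amount > 0)
    )
""")

conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Ada')")
conn.execute("INSERT INTO orders (customer_id, amount) VALUES (1, 25.0)")

# Two bad rows: a negative amount, and a customer that does not exist.
rejected = []
for customer_id, amount in [(1, -5.0), (99, 10.0)]:
    try:
        conn.execute(
            "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
            (customer_id, amount),
        )
    except sqlite3.IntegrityError:
        rejected.append((customer_id, amount))

# The join replaces the lookup formula a spreadsheet would need.
joined = conn.execute("""
    SELECT c.name, o.amount
    FROM orders AS o
    JOIN customers AS c ON c.id = o.customer_id
""").fetchall()

print(rejected)
print(joined)
```

In a spreadsheet both bad rows would have been accepted silently; here they never reach the table, so the join only ever sees valid data.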
2. Installing and configuring a database and learning how to use it is more complicated than just reading and writing data to a file. What are some important reasons that make a database a useful investment of time?
Setting up a database takes more time than reading and writing a CSV, but it pays off almost immediately:
I can change the question without rewriting the whole pipeline—just write a new query.
Indexes make “where” and “join” operations fast as data grows.
Transactions protect me from half-finished writes and weird race conditions.
Constraints keep the data clean so I don’t spend my life fixing typos later.
Backups and recovery exist, which matters the first time someone deletes the wrong thing.
Permissions let me share access safely, instead of passing around mystery copies of a file.
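Two of the points above, indexes and transactions, are easy to try out. The sketch below again assumes Python's sqlite3 module and invented tables (`orders`, `accounts`). It compares the query plan before and after adding an index, then shows a transfer that fails halfway being rolled back. The exact wording of the plan text depends on the SQLite version, so the code only prints it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Indexes: compare the query plan before and after creating one.
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(1000)],
)
conn.commit()


def plan(sql):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


query = "SELECT * FROM orders WHERE customer_id = 42"
plan_before = plan(query)  # a full table scan
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = plan(query)   # a search that uses the new index

# Transactions: a transfer either fully happens or does not happen at all.
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(src, dst, amount):
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer("alice", "bob", 30)       # succeeds
failed = transfer("alice", "bob", 500)  # would overdraw alice, so it is undone
balances = dict(conn.execute("SELECT name, balance FROM accounts"))

print(plan_before)
print(plan_after)
print(ok, failed, balances)
```

In the failed transfer, the credit to bob has already been applied when the debit from alice violates the CHECK constraint. The rollback removes that credit, which is the "half-finished write" protection in practice.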
A small, real example from this week: I mixed up an ORDER BY and a GROUP BY in one of the labs and got nonsense results. With a file, I’d probably start over in a notebook and hope I didn’t break something else. In SQL, I just fixed the query, re-ran it, and the underlying data stayed reliable.
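The lab query itself is not reproduced here, so the following is only an illustration of the general difference, using a made-up `sales` table and Python's sqlite3 module. ORDER BY sorts the existing rows, while GROUP BY collapses them into one row per group so that an aggregate such as SUM can be applied.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 10), ("west", 5), ("east", 20), ("west", 15)],
)

# ORDER BY: still four rows, just sorted.
sorted_rows = conn.execute(
    "SELECT region, amount FROM sales ORDER BY region, amount"
).fetchall()

# GROUP BY: one row per region, with the amounts summed.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()

print(sorted_rows)  # [('east', 10), ('east', 20), ('west', 5), ('west', 15)]
print(totals)       # [('east', 30), ('west', 20)]
```

Fixing a mix-up like this only means editing the query text; the rows in `sales` are never touched.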
3. What do you want to learn in this course that you think will be useful in your future career?
Short term, I want to be fluent with the “everyday” SQL I’ll need at work: solid joins (inner/left/right), grouping with HAVING, window functions for running totals and rankings, and how to read and speed up slow queries. I also want to get better at data modeling—how to design tables that make analysis easier instead of fighting it later. On the Java side, I want practice connecting code to a database the right way (prepared statements, connection pooling) so that what I write is safe and not brittle.
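As a rough preview of two of those goals, here is a sketch in Python's sqlite3 module (chosen because it needs no setup; the `daily_sales` and `users` tables are hypothetical). The first query uses window functions for a running total and a ranking, which requires SQLite 3.25 or newer. The second uses a `?` placeholder, the same idea as a JDBC `PreparedStatement` in Java: the input is passed as data and never spliced into the SQL text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Window functions: running total by day and rank by amount.
conn.execute("CREATE TABLE daily_sales (day INTEGER, amount INTEGER)")
conn.executemany("INSERT INTO daily_sales VALUES (?, ?)", [(1, 10), (2, 30), (3, 20)])

windowed = conn.execute("""
    SELECT day,
           amount,
           SUM(amount) OVER (ORDER BY day)     AS running_total,
           RANK()      OVER (ORDER BY amount DESC) AS amount_rank
    FROM daily_sales
    ORDER BY day
""").fetchall()

# Parameterized query: the hostile string is treated as a plain value.
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [("ada", "admin"), ("bob", "viewer")])

hostile_input = "nobody' OR '1'='1"
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (hostile_input,)
).fetchall()

# For contrast only: building the SQL by string concatenation lets the
# input rewrite the WHERE clause, so every row comes back. Never do this.
unsafe_sql = "SELECT name FROM users WHERE name = '" + hostile_input + "' ORDER BY name"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

print(windowed)     # [(1, 10, 10, 3), (2, 30, 40, 1), (3, 20, 60, 2)]
print(safe_rows)    # []
print(unsafe_rows)  # [('ada',), ('bob',)]
```

Connection pooling is not shown because an in-memory SQLite database has no server to pool connections to.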
Big picture, I’m aiming at data science or cybersecurity. I can already get a lot done in Excel, but pairing that with strong SQL will let me pull exactly the data I need, keep it clean, and hand off results that other people can trust. This week was a good start: quick wins in the labs, a few small mistakes (I forgot a semicolon once and chased a typo in a WHERE clause), and a reminder that learning the disciplined way—schemas, keys, and constraints—will save me time when the datasets get bigger and the questions get messier.