Interview Question
Country: United States
Partitioning of big tables should be done in such a way that most of queries get data from single partition. For example, if most of the time your queries recent data then it makes sense to partition on creation_date column but most of the times it is subjective. If you are talking about MySQL there are restrictions on creating partition which are not part of PK if the table has PK, so beware of that.
In SQL? You index on columns you're likely to query on. If a SELECT query has a WHERE clause checking for column X = a value, and if furthermore only a small portion of all the rows in the table have X equal to that value, then using an index on X can substantially improve the performance of the query. The benefit is then reaped everywhere you have a SELECT with a "WHERE X = some value" clause.
- eugene.yarovoi April 19, 2012The reason that having an index helps is that when you have an index, something like a hashmap is used, mapping from values of the indexed column to data rows. When you want to find records with a particular value in the indexed column, you use the value as a key to locate a list of records matching that value. However, using an index means extra time spent maintaining the index when data is added to or removed from the database. The indexes could also use a substantial amount of space. So it's a tradeoff. That's why implementations of SQL make you ask for an index instead of putting one in automatically.