Every so often I get called upon to troubleshoot the performance of some SQL queries (that I didn't design). Frankly I view this activity as both a waste of time and an indication of severe design failure. Yes it /is/ possible to optimize queries post hoc, and if you must work your way through this task I will give you some guidance shortly. It certainly isn't, however the optimal way to get work done.
First let me explain why this is a design failure. A critical task during development is building a test environment mocked up to full scale. If you fail to do this then you won't fully appreciate how to best design your queries to begin with.
Databases are all about reading data from disk. Yeah sure there is some writing too, but this is typically dwarfed a couple orders of magnitude by the task of reading the data. Databases are heavily IO bound.
Two things contribute to the time it takes to read table data or index pages: the "width" of each row and the quantity of rows. The width of a row is determined both by the number of columns as well as how much data each field contains.
Do you want fast reads? Use narrow tables and indexes (or use vertical partitioning). To limit the quantity of row reads, use horizontal partitioning or appropriate cluster indexing on tables with millions of rows. And guess what, you can't choose the best partitioning approach during development until you scale up.
Understand the above and you will have good baseline strategies for designing queries in the first place. Hence given the chance to "optimize" existing queries, a far better approach (if you have the time) is to refactor the application to take advantage of this knowledge.
Persist sub-queries to narrow temporary work tables. Target your query sequence to limit the selection criteria to the typically minimal rowset before finally joining in the "wide" information.
If you can't afford a full refactoring though, then you will have to resort to basic database query optimization techniques. Learn to understand the "explain plan" and to identify the time resource hogs. Use nolock hints where possible. Experiment with other hints to modify join order. Direct reads by a key are fast, full scans are slow, so build indices on columns used in joins.
Yeah you can soup up your Toyota Corolla queries to run at 160 miles per hour, but it would have been far better to design your queries like a Ferrari from the start.