Re: (newbie) SQL help and how to store it by David
David
Sat Dec 06 16:48:17 CST 2003
This is a fairly long post, but I've offered an answer to your question
in the first couple of paragraphs. Feel free to skip the rest, which is
me musing...
On 2003-12-05, Steve Klett <sklett_2222222@yahoo.com> wrote:
> Hi-
> this is not a complicated question and probably get's asked often....
>
> with that said, here is what I am trying to do. I have 3 tables
>
> table1 (products)
> ProdId
> ProdName
> ProdParentId
>
> table2 (faq_categories)
> CatId
> ProdId
> CatName
>
> table 3 (faq_main)
> FaqId
> CatId
> Question
> Answer
>
>
> What I want to do is get join all the "faq_categories" that match certain
> criteria in "products", then join all the "faq_main" that match the relevant
> faq_categories"
You know, IMHO that's actually a really, really good question, and one
I've been thinking a lot about lately in general terms.
Traditionally, the answer's pretty simple:
SELECT p.ProdName, c.CatName, f.Question, f.Answer
FROM products p
INNER JOIN faq_categories c ON p.ProdId = c.ProdId
INNER JOIN faq_main f ON c.CatId = f.CatId
WHERE p.ProdName LIKE 'Microsoft%'
ORDER BY p.ProdName, c.CatName
Then just run through the returned rowset. When ProdName changes, you
have a new first-order item, when CatName changes, you have a new
second-order item, etc.
// untested pseudocode
prodName = "";
lastCat = "";
while(dr.Read())
{
if(dr(ProdName) != lastProd)
{
lastProd = dr(ProdName)
Console.WriteLine(dr(ProdName))
Console.WriteLine("\t" + dr(CatName));
}
else if(dr(CatName) != lastCat)
{
lastCat = dr(CatName);
Console.WriteLine("\t" + dr(CatName));
}
Console.WriteLine("\t\t" + dr(Question));
Console.WriteLine("\t\t" + dr(Answer));
}
There's some real advantages to this. Databases are generally optimized
for this kind of query, you're only hitting the database once, and both
the code and the SQL are real easy to follow for anyone who's ever
worked with a database in virtually any language.
But...
Is this really the best strategy for ADO.NET? The disadvantage to the
above is that you pull down lots of extraneous data, for instance the
Product Name is in every single row even if you have only one product.
(Of course, this might not matter very much depending on the size of the
query. For example, if you're rendering into a web page, you probably
don't send a lot of data out to the user anyway).
Ideally, with ADO.NET we might want to pull down only the data we need
into a DataSet, map the tables to what we want, then define relations on
that DataSet to create our tree, but how?
CREATE PROC GetQuestions
SELECT ProdID, ProdName FROM Products WHERE ProdName like 'Microsoft%'
SELECT c.ProdID, c.CatID, c.CatName FROM Products p
INNER JOIN faq_categories ON p.ProdID = c.ProdID
WHERE p.ProdName like 'Microsoft%'
SELECT f.Question, f.Answer FROM Products p
INNER JOIN faq_categories c ON p.ProdId = c.ProdId ...
/* and so on... */
GO
Yuck. We can do the text lookup multiple times, once for each level in
the hierarchy, but since the text lookup is probably the least efficient
thing in the original query, repeating is likely to destroy all our nice
attempts at performance (although SQL Server might cache such a thing,
I'm not sure).
Or we have to stash the original keys in an array or temp table
someplace, and have the stored proc do the queries from that. But not
only are arrays a major pain in T-SQL and hard to maintain, but they
aren't particularly efficient to query on. Of course, storing the array
back in the middle tier makes the code easier to follow and maintain,
but now we've replace one nice JOIN query with three fairly inefficient
round trips to our database server.
Or am I missing something simpler? I dunno. I'm curious how people are
handling this type of situation. It's a pretty basic problem that
constantly pops up, but I've not seen it addressed anywhere. Any
thoughts? Or even better, is there a book or a whitepaper someplace
that discusses and (hopefully) benchmarks the alternatives?
--
David
dfoster at
hotpop dot com