Hello All,

Does anyone have any suggestions on working with large datasets (10GB -
100GB)?

Best initialization?
Transfer from dataset to database?

Anything would be helpful: articles, code examples, discussions, etc.

Thanks in advance.

Re: Large Datasets by Miha

Miha
Sun Dec 05 00:37:04 CST 2004

Hi,

Is there really a need to?
Can't you use just a part of it at a time?
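
To illustrate what "a part of it at a time" could look like, here is a minimal sketch. It is in Python with an in-memory SQLite database purely as a self-contained stand-in; the table name `readings` and the helpers `iter_batches`, `build_demo_db` and `total_value` are made up for this example. The same idea applies to any forward-only reader: only one batch of rows is held in memory at any moment.

```python
import sqlite3


def iter_batches(conn, query, batch_size=1000):
    """Yield lists of rows, each holding at most batch_size rows."""
    cursor = conn.execute(query)
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        yield rows


def build_demo_db(row_count):
    """Create a small in-memory table (hypothetical schema, for illustration)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (id INTEGER PRIMARY KEY, value REAL)")
    conn.executemany(
        "INSERT INTO readings (id, value) VALUES (?, ?)",
        ((i, i * 0.5) for i in range(1, row_count + 1)),
    )
    conn.commit()
    return conn


def total_value(conn, batch_size=1000):
    """Aggregate over the whole table while holding one batch at a time."""
    total = 0.0
    query = "SELECT id, value FROM readings ORDER BY id"
    for batch in iter_batches(conn, query, batch_size):
        total += sum(value for _id, value in batch)
    return total
```

The batch size is the knob that trades memory use against the number of round trips; the right value depends on row width and available memory.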

--
Miha Markic [MVP C#] - RightHand .NET consulting & development
SLODUG - Slovene Developer Users Group
www.rthand.com

Re: Large Datasets by Extreme

Extreme
Sun Dec 05 02:45:02 CST 2004

Let's just say, in theory, that it is necessary. Any recommendations?


Re: Large Datasets by Sahil

Sahil
Sun Dec 05 10:37:06 CST 2004

System.Data.DataSet won't do for such a large amount of data - that is just
not what it is meant to do. You would have to write your own class.

One approach is discussed here -
http://groups.google.com/groups?q=Viewing+large+amounts+of+data&hl=en&lr=&c2coff=1&selm=uHiqD%24U2EHA.3596%40TK2MSFTNGP12.phx.gbl&rnum=2
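
As a rough illustration of what such a custom class might look like (this is not the approach from the linked thread, only a sketch under assumptions), the following keeps just a few pages of rows in memory and loads others on demand. It is written in Python against SQLite so that it is self-contained; the class name `PagedTable`, its parameters and the `page_loads` counter are hypothetical.

```python
import sqlite3
from collections import OrderedDict


class PagedTable:
    """Read-only, index-addressable view that caches only a few pages of rows."""

    def __init__(self, conn, table, page_size=100, max_pages=4):
        self.conn = conn
        self.table = table  # assumed to be a trusted name; it is interpolated into SQL
        self.page_size = page_size
        self.max_pages = max_pages
        self._pages = OrderedDict()  # page number -> list of rows, in LRU order
        self.page_loads = 0  # number of times a page was fetched from the database

    def __len__(self):
        return self.conn.execute(f"SELECT COUNT(*) FROM {self.table}").fetchone()[0]

    def _load_page(self, page_no):
        if page_no in self._pages:
            self._pages.move_to_end(page_no)  # mark as most recently used
            return self._pages[page_no]
        rows = self.conn.execute(
            f"SELECT * FROM {self.table} ORDER BY rowid LIMIT ? OFFSET ?",
            (self.page_size, page_no * self.page_size),
        ).fetchall()
        self.page_loads += 1
        self._pages[page_no] = rows
        if len(self._pages) > self.max_pages:
            self._pages.popitem(last=False)  # evict the least recently used page
        return rows

    def __getitem__(self, index):
        if index < 0 or index >= len(self):
            raise IndexError(index)
        page = self._load_page(index // self.page_size)
        return page[index % self.page_size]
```

Memory use is bounded by `page_size * max_pages` rows regardless of table size. LIMIT/OFFSET is used here only for brevity; on very large tables, seeking by an indexed key would likely be the better choice for deep pages.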

- Sahil Malik
http://dotnetjunkies.com/weblog/sahilmalik





Re: Large Datasets by Kawarjit

Kawarjit
Sun Dec 05 20:04:51 CST 2004

In .NET 2.0 the DataSet and its related classes have been significantly
extended to scale and perform for large numbers of rows. However, 10GB - 100GB
is rather large to host in a single in-memory data structure. Are you
considering using 64-bit machines? Simple 32-bit machines can only go
up to 4GB.

It would help if you gave more details, for instance:
1. Are you considering partitioning data across multiple systems, or does it
have to be constrained to a single system?
2. Does the complete 10GB - 100GB of data need to be cached in main memory, or
can it be paged in from secondary storage?
3. What is the performance requirement? Is it more around insert, update, and
delete, or querying, or a mix of all of these?
4. What kind of querying support are you looking for, in terms of complexity of
expressions and performance?

You may want to take a look at the following article, which describes some of
the DataSet-related enhancements in .NET 2.0:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnadonet/html/datasetenhance.asp

Thanks,
Kawarjit Bedi

Program Manager - ADO.NET Team
Microsoft Corp.

This posting is provided "AS IS" with no warranties, and confers no rights.



Re: Large Datasets by Extreme

Extreme
Sun Dec 05 21:15:01 CST 2004

Hello Kawarjit,

Thanks for the in-depth analysis.

> 1. Are you considering partitioning data across multiple systems or does it
> have to be constrained to a single system?

Multiple systems. What would be the best partitioning strategy?

> 2. Does the complete 10GB - 100GB data needs to be cached in main memory or
> can it be paged in from secondary storage?

Main memory would be nice but is probably not possible. What would you suggest
for secondary storage?

> 3. What is the performance requirement, is it more around insert, update and
> delete or querying? is it a mix of all of these

Query speed would be vital; querying would be the only operation done.

> 4. What is the kind of querying support in terms of complexity of
> expressions and performance you are looking for.

Mild complexity. Again, query performance would be vital.

I really appreciate your feedback Kawarjit.

Thanks