<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en-GB">
	<id>http://pel.ucd.ie/index.php?action=history&amp;feed=atom&amp;title=24_June_2013</id>
	<title>24 June 2013 - Revision history</title>
	<link rel="self" type="application/atom+xml" href="http://pel.ucd.ie/index.php?action=history&amp;feed=atom&amp;title=24_June_2013"/>
	<link rel="alternate" type="text/html" href="http://pel.ucd.ie/index.php?title=24_June_2013&amp;action=history"/>
	<updated>2026-04-15T02:29:58Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.35.1</generator>
	<entry>
		<id>http://pel.ucd.ie/index.php?title=24_June_2013&amp;diff=789&amp;oldid=prev</id>
		<title>Thomas at 09:43, 24 June 2013</title>
		<link rel="alternate" type="text/html" href="http://pel.ucd.ie/index.php?title=24_June_2013&amp;diff=789&amp;oldid=prev"/>
		<updated>2013-06-24T09:43:08Z</updated>

		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left diff-editfont-monospace&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en-GB&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 09:43, 24 June 2013&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l2&quot; &gt;Line 2:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 2:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Abstract: In a wide range of application areas (e.g. data mining, approximate query evaluation, histogram construction), database sampling has proved to be a powerful technique. It is generally used when the computational cost of processing large amounts of information is extremely high, and a faster response with a lower level of accuracy for the results is preferred. Previous sampling techniques achieve this balance; however, the cost of the database sampling process itself should also be evaluated. We argue that the performance of current relational database sampling techniques that maintain the data integrity of the sample database is low and that a faster strategy needs to be devised. In this paper we propose a very fast sampling method that keeps the referential integrity of the sample database intact. The sampling method targets the production environment of a system under development, which generally consists of large amounts of data that are computationally costly to analyze. We evaluate our method in comparison with previous database sampling approaches and show that it produces a sample database at least 300 times faster, with a maximum trade-off of 4.5% in terms of sample size error.&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Abstract: In a wide range of application areas (e.g. data mining, approximate query evaluation, histogram construction), database sampling has proved to be a powerful technique. It is generally used when the computational cost of processing large amounts of information is extremely high, and a faster response with a lower level of accuracy for the results is preferred. Previous sampling techniques achieve this balance; however, the cost of the database sampling process itself should also be evaluated. We argue that the performance of current relational database sampling techniques that maintain the data integrity of the sample database is low and that a faster strategy needs to be devised. In this paper we propose a very fast sampling method that keeps the referential integrity of the sample database intact. The sampling method targets the production environment of a system under development, which generally consists of large amounts of data that are computationally costly to analyze. We evaluate our method in comparison with previous database sampling approaches and show that it produces a sample database at least 300 times faster, with a maximum trade-off of 4.5% in terms of sample size error.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;[[Category:news|2013 06 24]]&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Thomas</name></author>
	</entry>
	<entry>
		<id>http://pel.ucd.ie/index.php?title=24_June_2013&amp;diff=787&amp;oldid=prev</id>
		<title>Thomas: Created page with &quot;The paper 'VFDS: Very Fast Database Sampling System' by Teodora Sandra Buda, Thomas Cerqueus, John Murphy and Morten Kristiansen has been accepted for publication at IEEE IRI 201...&quot;</title>
		<link rel="alternate" type="text/html" href="http://pel.ucd.ie/index.php?title=24_June_2013&amp;diff=787&amp;oldid=prev"/>
		<updated>2013-06-24T09:40:38Z</updated>

		<summary type="html">&lt;p&gt;Created page with &amp;quot;The paper &amp;#039;VFDS: Very Fast Database Sampling System&amp;#039; by Teodora Sandra Buda, Thomas Cerqueus, John Murphy and Morten Kristiansen has been accepted for publication at IEEE IRI 201...&amp;quot;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;The paper 'VFDS: Very Fast Database Sampling System' by Teodora Sandra Buda, Thomas Cerqueus, John Murphy and Morten Kristiansen has been accepted for publication at IEEE IRI 2013 (14th International Conference on Information Reuse and Integration).&lt;br /&gt;
&lt;br /&gt;
Abstract: In a wide range of application areas (e.g. data mining, approximate query evaluation, histogram construction), database sampling has proved to be a powerful technique. It is generally used when the computational cost of processing large amounts of information is extremely high, and a faster response with a lower level of accuracy for the results is preferred. Previous sampling techniques achieve this balance; however, the cost of the database sampling process itself should also be evaluated. We argue that the performance of current relational database sampling techniques that maintain the data integrity of the sample database is low and that a faster strategy needs to be devised. In this paper we propose a very fast sampling method that keeps the referential integrity of the sample database intact. The sampling method targets the production environment of a system under development, which generally consists of large amounts of data that are computationally costly to analyze. We evaluate our method in comparison with previous database sampling approaches and show that it produces a sample database at least 300 times faster, with a maximum trade-off of 4.5% in terms of sample size error.&lt;/div&gt;</summary>
		<author><name>Thomas</name></author>
	</entry>
</feed>