Tuesday, February 3, 2015

Assignment 2 : Writing an introduction

Bootstrap Method for Streaming Data

Perasut  Rungcharassang


   Stage 1 :    Typical statistical methods work with static data sets. A static data set has the following properties: it does not change over time, its size is fixed (so it can be stored in full), and it follows a clear distribution (such as a normal or uniform distribution). Statistical values (mean, standard deviation, etc.) are calculated from the whole static data set. In recent years, however, the form of data has changed, and many applications now need to work with non-static data sets. This type of non-static data set can be called a data stream or streaming data. Its properties are the opposite of those of a static data set.

  Stage 2 :    Efron (1979) introduced the bootstrap method, a statistical tool for estimating statistical values. The bootstrap is a simple method for estimating the sampling distribution of a statistic computed from sample data. It generates many resamples by sampling the original training data with replacement, and these resamples represent the sampling distribution. The bootstrap method is applied when little statistical information about the data set is known, when only a small amount of data is available, or when standard methods cannot be applied. It has been used to handle several problems, such as signal processing (Zoubir & Boashash, 1998; Zoubir & Iskander, 2007) and the class imbalance problem (Thanathamathee & Lursinsap, 2013).
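
  The resampling step described in Stage 2 can be sketched roughly as follows. This is only a minimal illustration, assuming the statistic of interest is the mean. The helper names (bootstrap_means, percentile_interval) and the data values are made up for this example and do not come from the cited papers.

```python
import random
import statistics


def bootstrap_means(sample, n_resamples=1000, seed=0):
    """Return the mean of each bootstrap resample of `sample` (hypothetical helper)."""
    rng = random.Random(seed)
    n = len(sample)
    means = []
    for _ in range(n_resamples):
        # Draw n items from the original sample with replacement.
        resample = rng.choices(sample, k=n)
        means.append(statistics.fmean(resample))
    return means


def percentile_interval(values, alpha=0.05):
    """Simple percentile interval taken from the sorted bootstrap statistics."""
    ordered = sorted(values)
    lower = ordered[int((alpha / 2) * len(ordered))]
    upper = ordered[int((1 - alpha / 2) * len(ordered)) - 1]
    return lower, upper


if __name__ == "__main__":
    data = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7]  # made-up values
    means = bootstrap_means(data)
    # The spread of the resample means estimates the standard error of the mean.
    print("bootstrap standard error:", statistics.stdev(means))
    print("95% percentile interval:", percentile_interval(means))
```

  Note that rng.choices needs the whole sample in memory, because every resample is drawn from all n items.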

  Stage 3 :    The original bootstrap method needs the whole data set in order to generate its resamples. However, when the data set is huge, this takes more time to calculate and more storage. Since the data of interest in this paper is streaming data, which cannot be stored in full, the original bootstrap method cannot be applied to the whole stream.

 Stage 4&5 :  The purpose of this paper is to improve the original bootstrap method in order to apply it to classifying streaming data.

     

My comments on my friends' blogs :


#1 
http://sornjarodoonsiri.blogspot.com/2015/01/introduction.html?showComment=1423053397044#c1028524126658686890
#2 
http://suwatthikul.blogspot.com/2015/02/assignment2-writing-introduction.html?showComment=1423055970697#c6274032821322178279

                    

4 comments:

  1. The purpose of this paper is to improve the original bootstrap method and apply to classifying streaming data in order to blend capabilities to real-time analysis of the system, with the ability to take immediate process-based action on the discovered insights.

    (I am not sure about the benefit of the bootstrap method. I think that what comes after "in order to" should be the value to others.)

  2. In my opinion, in Stage 1, for "This type of the non-static data set can be...", I would cut "This type of" and change it to "The non-static data set can be....".
    In Stage 4&5, for "in order to apply to classifying", maybe I would cut "apply to".
