Acquisitional query processing(ACQP)
. where
. when
. how often (sample, periodic)
. query processing operator
. locations and costs of data
. reduce power consumption(low-power, energe efficient)
1. query language
2. query parsing
3. query optimization
4. query dissemination(대상이 되는 sensor들에게 분배)
5. query execution(결과를 얻어옴)
TinyDB
. a distributed query processor for smart sensor devices
. on TinyOS
. on Berkeley Mica mote platform
Smart sensor device
. large, distributed network
. wireless, battery-powered
. autonomous node
. without human interaction
. lifetime : weeks~months(long term)
. monitering, collect data
. aggregation operations
. filtering operations
. push-down techniques
. minimize communication
. indexing
. stream processiong
. approximate query answering
. routing tree
. query propagation(forward)
. basestation - the root of the network
. hop-count(= level) - distance from the root
. parent : root에 한 단계 더 가까운 node
. multi-parent = different root = simultaneous query
. environmental monitoring deployment
. earthquake monitoring
. building infrastructure monitoring
. occupancy
. coming or going
. humidity, temperature
. battery나 sensor를 한 번 교체하려면 돈이 매우 많이 듬.
Mica mote
. duty cycle
. 100% - a few days
. 2%(1.2sec/min) - 6 month(2 AA battery)
. 4Mhz, 8bit Atmel
. RFM TR1000 radio
. range : 1~100 feet
. 40 kbits/sec single shared CSMA channel
. 1048bytes messages(TinyDB default)
. energy consumption
. 1 bit data transmit = 1,000 CPU instruction
. 32kHz clock : +-1ms accuracy with neighbor
. mode
. snoozing mode - idle, waiting for timer(or external event)
. processing mode - local query result
. processing and receiving mode - collect result
. transmitting mode - deliver data
. communication
. multi-hop = relay
. broadcast
. snoop - message를 감시함
. link-level(lan card나 device레벨) ack
. No end-to-end ack
. ad-hoc - topologies automatically discovered
. ACQ Language
. SELECT-FROM-WHERE
. selection, join, projection, aggregation
. sampling, windowing, sub-queries via materialization point
. sample interval
. epoch - 한 sample기간내
. one column per sensor type
. virtual table = unbounded
. output - timestamp, data
. materialization point?
. materialized view?
. http://www.databasejournal.com/features/oracle/article.php/2192071
. http://www.cs.ncl.ac.uk/teaching/facilities/swdoc/oracle9i/server.920/a96540/statements_63a.htm
. a database object that contains the result of a query
. local copies of data located remotely
. copies of remote data on your local node
. read-only
. storage point?
. landmark query?
. query id
. issuer가 query id를 가지고 stop 명령을 내릴 수도 있음.
. occupancy detection
. sound volume을 통해서 알아냄.
. event based query
. polling based trigger가 event based trigger보다 에너지를 많이 쓴다.
. lifetime based query
. lifetime estimation
. 사람들은 사실 power comsumption이나 sampling rate 같은 복잡한 특성이나
기능은 잘 모르고 단지 lifetime에만 관심있을 뿐이다.
따라서 lifetime을 입력하면 적절히 자동으로 estimate해서 돌아야 한다.
. costs of accessing sensors,
selectivities of operators,
expected communication rates,
current battery voltage
. transmission rate : 1시간에 몇 번 전송할 것인가
= sampling rate
. 수식으로 예상한 시간과 실제 실험 시간이 거의 일치했다.
. power-aware optimization
. basestation(root) - query를 parsing하고 optimize하는 곳이기도 함.
. metadata management - 주기적으로 basestation에 정보 저장
. metadata - power, sample time, constant?, rate of change, range
. aggregation
. monotonic - 단순히 증가
. exemplary(summary) - aggregation set에서 어떤 대표값을 뽑음.
. query의 종류에 따라 cost계산
. ordering of sampling and predicates
. P = S + T
. P : Predicate
. S : set of operations
. jobs
. exemplary aggregate pushdown
. selectivity를 이용해서 잘 reorder해서 power를 아끼자.
. series parallel graph?
http://www.algorithmic-solutions.info/leda_guide/graph_algorithms/series-parallel.html
. event query batching
. multi-query optimization technique
. sliding window join
. stream join = rewritten queries, event queue empty check를 함.
. asynch events = non-rewritten queries
. event가 많으면 stream join이 asynch events보다 빠르다.
. sample의 phase를 잘 맞춰야 reordering이 쉽다.
. phase가 맞지않아도 user는 별 상관없으니 굳이 맞춰야 한다면
oversampling을 한다.
. power sensitive dissemination and routing
. query result를 구하는 데, 꼭 필요한 node들에만
routing하는 것이 중요함.
. SRT : Semantic routing trees
. link quality based parent selection algorithm
. SRT build request를 먼저 날림
. query에 참여하는 node들의 id가 각 parent node에 기록됨.
. parent selection
. random
. cloest-parent
. clustered : 다른 sibiling들과 같은 parent를 선택하여 spread를 줄인다.
. distribution
. random distribution
. geographic distribution
. neighbor의 위치과 correlation되어서 배치
. boundary
. best-case
. no SRT
. 결론 : clustered가 가장 낫다.
. processing query
. 모든 node가 동시에 sleep, awake(time syncronized)
. aggregation
. temporal aggregate
. aggregates in the same epoch
. partial state record
. 각 intermediate node에서 aggregate를 위한 중간 집계를 미리하므로
root까지 전송되는 data양이 줄어든다.
. prioritizing data delivery
. queue가 overflow되면 data를 discard해야 한다.
. priority를 정해서 중요한, quality가 높은 정보를 먼저 보낸다.
. runtime decision이 필요하다.
. RMS(Root mean square)등의 방법을 쓸 수 있다.
. scheme
. naive : FIFO
. winavg : 처음 두개를 average해서 뒤에 공간을 하나 만듬.
. delta : largest change를 report
. delta의 RMS에러가 가장 적다.
. adapting rates and power consumption
. network contention
. node의 갯수가 늘고 sample이 늘어나면 network이 가득차서
전송이 안된다. 많이 보내봤자 성공률은 오히려 떨어진다.
따라서 adaptive하게 조절하지 해야 한다.
. power consumption
. reestimation
. adaptation optimizer
. ordering decision
. frequency
. TinyDB에서는 유저가 요청할 때만 reestimation이 일어난다.
. estimation의 가능성 : predicated batter voltage - decay linearly
. relative work
. laptop and palmtop은 mote와 달리 rapid power-cycling이 힘들다.
하는 일이 단순 데이터 수집보다 훨씬 복잡하고
갑자기 전원을 꺼버린다면 user가 적응하지 못할 것이다.
. SRT = indexing in traditional DB
. compressed database
. event based query
. approximate and best effort caches
. future work
. reoptimization
. burst of data
. unexpected power consumption
. users' prioritization preference