Pentaho pan

Definition

Pan is the command-line module for running transformations, including ones stored in a remote repository.

Installation

Pan ships bundled with data-integration (the same package that provides carte).

Configuration

Repositories fall into the following two types. Note that the way they are configured differs from carte, and kitchen differs again.

  • file repository
  • database repository

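Normally, multiple repositories can be registered in $HOME/.kettle/repositories.xml,

but pan.sh reads repositories.xml from the current path (in the log output below, the data-integration directory).

The contents of repositories.xml are the same as the $HOME/.kettle/repositories.xml configured on the carte server.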
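
Since the file pan.sh reads has the same contents as the one under $HOME/.kettle, one way to keep them in sync is simply to copy it. The following is a minimal sketch under that assumption; `install_repositories_xml` is a hypothetical helper name, not something Pentaho provides, and the example paths are taken from the log output in this post.

```shell
# Copy a repositories.xml into the directory where pan.sh looks for it.
# install_repositories_xml is a hypothetical helper, not part of Pentaho.
install_repositories_xml() {
  local src="$1"       # e.g. "$HOME/.kettle/repositories.xml"
  local dest_dir="$2"  # e.g. the data-integration directory
  if [ ! -f "$src" ]; then
    echo "source not found: $src" >&2
    return 1
  fi
  # Always write the file under the name repositories.xml.
  mkdir -p "$dest_dir" && cp "$src" "$dest_dir/repositories.xml"
}

# Example (paths are assumptions based on the logs in this post):
# install_repositories_xml "$HOME/.kettle/repositories.xml" /home/bos/packages/data-integration
```

Running `./pan.sh -listrep` afterwards should show whether the copied file was picked up.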

Execution

repository list

[bos@tlog-kafka03 data-integration]$ ./pan.sh -listrep
... (omitted) ...
2020/11/27 15:53:52 - Start of run.
2020/11/27 15:53:52 - RepositoriesMeta - Reading repositories XML file: /home/bos/packages/data-integration/repositories.xml
#1 : postgres [Database repository]
#2 : file_repository [Pentaho repository | http://localhost:8080/pentaho]
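
If only the repository names are needed (for example to pass to -rep in a script), the listing can be filtered. This is a rough sketch that assumes the output keeps the `#N : name [type]` shape shown above; `list_repo_names` is a hypothetical helper name.

```shell
# Extract repository names from `pan.sh -listrep` output read on stdin.
# Assumes result lines shaped like "#1 : postgres [Database repository]".
# Log lines (which do not start with '#') are dropped.
list_repo_names() {
  sed -n 's/^#[0-9][0-9]* : \(.*\) \[.*\]$/\1/p'
}

# Example:
# ./pan.sh -listrep | list_repo_names
```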

transformation list


[bos@tlog-etl carte]$ pan.sh -rep=postgres -user=admin -pass=admin -listtrans
2020/11/27 16:24:27 - Start of run.
2020/11/27 16:24:27 - RepositoriesMeta - Reading repositories XML file: /home/bos/packages/data-integration/repositories.xml
Transformation  <- transformation name

logfile

One quirk: the intermediate log lines are written to the log file, but the result itself (the transformation list) is not.

[bos@tlog-etl carte]$ pan.sh -rep=postgres -user=admin -pass=admin -listtrans -logfile=trans.out
2020/11/27 16:32:25 - Start of run.
2020/11/27 16:32:25 - RepositoriesMeta - Reading repositories XML file: /home/bos/packages/data-integration/repositories.xml
Transformation
[bos@tlog-etl carte]$ cat trans.out
2020/11/27 16:28:09 - Start of run.
2020/11/27 16:28:09 - RepositoriesMeta - Reading repositories XML file: /home/bos/packages/data-integration/repositories.xml
2020/11/27 16:29:03 - Logging is at level : Row Level (very detailed)
2020/11/27 16:29:03 - Start of run.
2020/11/27 16:29:03 - Allocate new transformation.
2020/11/27 16:29:03 - Starting to look at options...
2020/11/27 16:29:03 - Parsing command line options.
2020/11/27 16:29:03 - Loading available repositories.
2020/11/27 16:29:03 - RepositoriesMeta - Reading repositories XML file: /home/bos/packages/data-integration/repositories.xml
2020/11/27 16:29:03 - RepositoriesMeta - We have 1 connections...
2020/11/27 16:29:03 - RepositoriesMeta - Looking at connection #0
2020/11/27 16:29:03 - RepositoriesMeta - Read at connection: postgres19
2020/11/27 16:29:03 - RepositoriesMeta - We have 2 repositories...
2020/11/27 16:29:03 - RepositoriesMeta - Looking at repository #0
2020/11/27 16:29:03 - RepositoriesMeta - Read at repository: postgres
2020/11/27 16:29:03 - RepositoriesMeta - Looking at repository #1
2020/11/27 16:29:03 - RepositoriesMeta - Read at repository: file_repository
2020/11/27 16:29:03 - Finding repository [postgres]
2020/11/27 16:29:03 - Check supplied username and password.
2020/11/27 16:29:03 - Allocate & connect to repository.
2020/11/27 16:29:03 - Getting list of transformations in directory: /
2020/11/27 16:31:18 - Logging is at level : Row Level (very detailed)
2020/11/27 16:31:19 - Start of run.
2020/11/27 16:31:19 - Allocate new transformation.
2020/11/27 16:31:19 - Starting to look at options...
2020/11/27 16:31:19 - Parsing command line options.
2020/11/27 16:31:19 - Loading available repositories.
2020/11/27 16:31:19 - RepositoriesMeta - Reading repositories XML file: /home/bos/packages/data-integration/repositories.xml
2020/11/27 16:31:19 - RepositoriesMeta - We have 1 connections...
2020/11/27 16:31:19 - RepositoriesMeta - Looking at connection #0
2020/11/27 16:31:19 - RepositoriesMeta - Read at connection: postgres19
2020/11/27 16:31:19 - RepositoriesMeta - We have 2 repositories...
2020/11/27 16:31:19 - RepositoriesMeta - Looking at repository #0
2020/11/27 16:31:19 - RepositoriesMeta - Read at repository: postgres
2020/11/27 16:31:19 - RepositoriesMeta - Looking at repository #1
2020/11/27 16:31:19 - RepositoriesMeta - Read at repository: file_repository
2020/11/27 16:31:19 - Finding repository [postgres]
2020/11/27 16:31:19 - Check supplied username and password.
2020/11/27 16:31:19 - Allocate & connect to repository.
2020/11/27 16:31:19 - Getting list of transformations in directory: /
2020/11/27 16:32:25 - Start of run.
2020/11/27 16:32:25 - RepositoriesMeta - Reading repositories XML file: /home/bos/packages/data-integration/repositories.xml
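
Because the result never reaches the file given with -logfile, one workaround is to capture the console output instead. A minimal sketch, assuming the result is printed on stdout or stderr as in the console output above; `run_and_capture` is a hypothetical helper name.

```shell
# Run a command and keep everything it prints in a file as well as on screen.
# run_and_capture is a hypothetical helper; the first argument is the output file.
run_and_capture() {
  local outfile="$1"
  shift
  # 2>&1 merges stderr into stdout so both streams reach tee.
  # tee overwrites the file; use `tee -a` to append instead.
  # The return status is tee's, not the wrapped command's.
  "$@" 2>&1 | tee "$outfile"
}

# Example (same options as above, minus -logfile):
# run_and_capture trans.out pan.sh -rep=postgres -user=admin -pass=admin -listtrans
```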

 

log level

[bos@tlog-etl carte]$ pan.sh -rep=postgres -user=admin -pass=admin -listtrans -level=Rowlevel -logfile=trans.out
2020/11/27 16:31:18 - Logging is at level : Row Level (very detailed)
Arguments:
rep          : postgres
user         : admin
trustuser    :
pass         : admin
trans        :
dir          :
file         :
level        : Rowlevel
logfile      : trans.out
log          :
listdir      :
listtrans    : Y
listrep      :
exprep       :
norep        :
safemode     :
version      :
jarfile      :
param        : null
listparam    :
initialDir   : /home/bos/carte/
stepname     :
copynum      :
zip          :
uuid         :
metrics      :
maxloglines  :
maxlogtimeou :
 
2020/11/27 16:31:19 - Start of run.
2020/11/27 16:31:19 - Allocate new transformation.
2020/11/27 16:31:19 - Starting to look at options...
2020/11/27 16:31:19 - Parsing command line options.
2020/11/27 16:31:19 - Loading available repositories.
2020/11/27 16:31:19 - RepositoriesMeta - Reading repositories XML file: /home/bos/packages/data-integration/repositories.xml
2020/11/27 16:31:19 - RepositoriesMeta - We have 1 connections...
2020/11/27 16:31:19 - RepositoriesMeta - Looking at connection #0
2020/11/27 16:31:19 - RepositoriesMeta - Read at connection: postgres19
2020/11/27 16:31:19 - RepositoriesMeta - We have 2 repositories...
2020/11/27 16:31:19 - RepositoriesMeta - Looking at repository #0
2020/11/27 16:31:19 - RepositoriesMeta - Read at repository: postgres
2020/11/27 16:31:19 - RepositoriesMeta - Looking at repository #1
2020/11/27 16:31:19 - RepositoriesMeta - Read at repository: file_repository
2020/11/27 16:31:19 - Finding repository [postgres]
2020/11/27 16:31:19 - Check supplied username and password.
2020/11/27 16:31:19 - Allocate & connect to repository.
2020/11/27 16:31:19 - Getting list of transformations in directory: /
Transformation
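
For scripts that take the log level as a parameter, it can help to check the name before calling pan.sh. The level names below are the ones I recall from the Pentaho documentation (Rowlevel is the one used above); treat the list as an assumption and check it against your version. `is_pan_log_level` is a hypothetical helper name.

```shell
# Return success if the argument is one of the assumed -level names.
# The match is exact; whether pan.sh accepts other capitalizations is not checked here.
is_pan_log_level() {
  case "$1" in
    Nothing|Error|Minimal|Basic|Detailed|Debug|Rowlevel) return 0 ;;
    *) return 1 ;;
  esac
}

# Example:
# is_pan_log_level "$level" && pan.sh -rep=postgres -user=admin -pass=admin -listtrans -level="$level"
```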

execute transformation

[bos@tlog-etl carte]$ pan.sh -rep=postgres -user=admin -pass=admin -trans=Transformation
2020/11/27 16:34:55 - Start of run.
2020/11/27 16:34:55 - RepositoriesMeta - Reading repositories XML file: /home/bos/packages/data-integration/repositories.xml
2020/11/27 16:34:56 - Transformation - Dispatching started for transformation [Transformation]
2020/11/27 16:34:56 - Generate random value.0 - Finished processing (I=0, O=0, R=1, W=1, U=0, E=0)
2020/11/27 16:34:56 - Text file output.0 - Finished processing (I=0, O=2, R=1, W=1, U=0, E=0)
2020/11/27 16:34:56 - Carte - Installing timer to purge stale objects after 1440 minutes.
2020/11/27 16:34:56 - Finished!
2020/11/27 16:34:56 - Start=2020/11/27 16:34:56.356, Stop=2020/11/27 16:34:56.593
2020/11/27 16:34:56 - Processing ended after 0 seconds.
2020/11/27 16:34:56 - Transformation - 
2020/11/27 16:34:56 - Transformation - Step Generate random value.0 ended successfully, processed 1 lines. ( - lines/s)
2020/11/27 16:34:56 - Transformation - Step Text file output.0 ended successfully, processed 1 lines. ( - lines/s)

When pan connects to a database repository, it records the transformation's execution history, but on the first run the recorded start time looks wrong.
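
When pan.sh is called from a script, the exit status says more than the log tail does. The mapping below is a sketch based on the status codes as I recall them from the Pentaho documentation for Pan; verify it against your version. `describe_pan_exit` is a hypothetical helper name.

```shell
# Translate a pan.sh exit status into a short message.
# The code meanings are an assumption taken from the Pentaho docs for Pan.
describe_pan_exit() {
  case "$1" in
    0) echo "transformation ran without a problem" ;;
    1) echo "errors occurred during processing" ;;
    2) echo "unexpected error during loading or running" ;;
    3) echo "unable to prepare and initialize the transformation" ;;
    7) echo "transformation could not be loaded from XML or the repository" ;;
    8) echo "error loading steps or plugins" ;;
    9) echo "command line usage printed" ;;
    *) echo "unknown exit code: $1" ;;
  esac
}

# Example:
# pan.sh -rep=postgres -user=admin -pass=admin -trans=Transformation
# describe_pan_exit "$?"
```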
