About sharing Talend development w/ version control

0


Per http://www.talendforge.org/forum/viewtopic.php?id=6876 (dated Jun 2009)

Q: Is it possible to configure Talend Open Studio with a repository on a shared drive??

A1: No, TOS can only be used on local machine. To share the project, you can use talend Integration Suite, which is the subscription extension of TOS, support multiple users, groups, specific access for these users, specifics privileges etc... For more info, please access our website.

A2: If your drive is mounted, you can try to change the workspace location by adding the "data" option in the shortcut : "C:\TOS\TOS-win32-x86.exe -data Z:\mySharedWorkspace"
This remote workspace will be locked by the first person who use it. So you won't be able to work at the same time with a coworker on this workspace : this option is only available in TIS.

Per http://stackoverflow.com/questions/4604823/talend-project-in-svn (dated Jan 2011)

Q: I need to use Talend Open Studio with a SVN. What is the folder that I should put in the repository?

A1: Well, the "official" answer is to upgrade to Talend Integration Suite (TIS), which includes SVN integration. It goes a long way to synchronizing the activities of several developers using a shared repository.

If TIS is not an option for you, you might have some success by putting the project directory of your Talend Open Studio (TOS) under SVN control. This is the directory immediately under "workspace" that has the same name as your project. You would have to perform all SVN operations, such as commit and update, manually on this directory, for example using Tortoise (if you're on Windows). TOS might be able to use this project as if it was not under SVN. I personally haven't tried this, and would recommend using TIS instead.

A2: I prepared the following for my team. http://www.dariopardo.com/talend-open-studio/svnintegration/ It basically involves doing what was said above. (note – this one uses Subversion repository w/ TortoiseSVN client)

A3: I faced a similar issue an wrote a blog about it, see Putting Talend Open Studio projects under version control. Basically you should version your entire workspace. (note – this one uses Git)

A4: Putting a TOS workspace under SVN is strongly discouraged, as CSV/SVN as file-based by nature while TOS workspace is directory-based. The probability that you incur in workspace corruption are very high.

However I achieved good results putting in under git, instead (have a look here for an idea). I must say it's not elegant and putting the entire workspace under version control is a tremendous waste of space (you will commit compiled files, logs, history, temp...), but it's the only safe solution for your metadata integrity.

Be aware! In *<workspace_root>/.java* subtree there are external libraries and the classpath files for each job in your project you executed at least one time. These paths are absolute. This mean that if you plan to use your version-controlled workspace in a shared environment anyone in team must place the workspace in the absolute location in their local file system (ie. c:/talend_git/workspace). Otherwise you'll get a class not found exception upon executing jobs. Sad but true.

So, my conclusion --- Talend version control for shared development is possible but unless we need to collaborate on Talend jobs frequently and are very skilled in SVN/Git, it can be tricky to incorporate all team members' Talend projects in a repository and once in a while take over for continuing development. The learning curve to incorporate Talend w/ version control software is likely high. As mentioned for answer A4 above, the chances of corrupted file is very high, and I know that can ruin the whole Talend project rather than just one Talend job).

What's the alternative --- just have a repository so everyone can export his/her own projects (or jobs) for others to grab as needed. We have /home/developers/Talend-jobs on dev.berkeley.edu (pahma-dev) we can use, or setup account on code@berkeley.edu (which I have difficulty learning it previously), or maybe the easiest is use Berkeley's "box".

Cycle Handling in PostGreSQL

0
Below is the way to handle Cycle data in Postgresql database

 CREATE TABLE FOLDER(ID INT , NAME VARCHAR(255), PARENT INT);

INSERT INTO FOLDER VALUES(1, '/', null);
INSERT INTO FOLDER VALUES(2, 'src', 1);
INSERT INTO FOLDER VALUES(3, 'main', 2);
INSERT INTO FOLDER VALUES(4, 'org', 3);
INSERT INTO FOLDER VALUES(5, 'test', 2);


With RECURSIVE CTE(Id,NAME,Parent,Pathh,path, cycle)
as
(
Select Id,NAME,parent, cast(Id  as varchar(30)) pathh,ARRAY[Folder.Id] path,false from Folder
where parent is null
union all
Select 
F.id,F.name,F.Parent,cast(F.Id||'-->'||C.Pathh as varchar(30))  pathh,
C.Path||F.id path,F.id = any(path) as cycle
from 
Folder F,CTE C
where F.Parent=C.ID  and not cycle
)
Select C.* from CTE C
order by 1 


ORACLE PLSQL Exception Handling

0


SQL> create table a (x number (10), y number(10);
Table created.

SQL> insert into a values(1, 5);
SQL> insert into a values(2, 10);
SQL> insert into a values(3, 15);
SQL> insert into a values(3, 0);
SQL> insert into a values(4, 20);
SQL> insert into a values(5, 25);
SQL> commit;
SQL> select x, y from a order by x;

X Y
---------- ----------
1 5
2 10
3 15
3 0
4 20
5 25
SQL>
declare
v_temp number:= 30;
v_tot number;
v_x number;
v_y number;
cursor c_temp is
select x, y from a order by x;
begin
open c_temp;
loop
fetch c_temp into v_x, v_y;
exit when c_temp%notfound;
v_tot:=v_x+v_temp/v_y;
dbms_output.put_line('Total is :'||v_tot);
end loop;
close c_temp;
exception
when others then
dbms_output.put_line('Found Oracle error: '||SQLERRM);
end;
/

Total is :7
Total is :5
Total is :5
Found Oracle error: ORA-01476: divisor is equal to zero

PL/SQL procedure successfully completed.

Only Partial Records are Printed. 

Below Code Prints : Complete Numbers 

declare
  v_temp number:= 30;
  v_tot number;
  v_x number;
  v_y number;
  cursor c_temp is
    select x, y from a order by x;
begin
  open c_temp;
  loop
    fetch c_temp into v_x, v_y;
    exit when c_temp%notfound;
    begin
      v_tot:=v_x+v_temp/v_y;
      dbms_output.put_line('Total is :'||v_tot);
    exception
      when others then
      dbms_output.put_line('Found Oracle error: '||SQLERRM);
    end;
  end loop;
  close c_temp;
end;

PLSQL / Works Cursors with Exception Handling

0
Simple Stored Procedure to Load the Cursor Dynamically:

create or replace procedure asf_cursor is 
TYPE EmpCurTyp  IS REF CURSOR;
c2  EmpCurTyp;
cursor C1 is (select 1 a from dual union all select 2 a from dual);
dynamic_query varchar2(1000);
i int;
var1 varchar(100);
BEGIN
open c1;
loop
fetch c1 into i;
EXIT WHEN c1%NOTFOUND;
dynamic_query:= 'select '''||'~'||i||'''  from dual';
open c2 for dynamic_query;
  LOOP
    FETCH c2 INTO var1;
    EXIT WHEN c2%NOTFOUND;
  END LOOP;
close c2;
DBMS_OUTPUT.PUT_LINE(var1);
end loop;
close c1;
END; 

set serveroutput on 
execute asf_cursor;

 

Exception Handling for Table Not Found:

-- set serveroutput on;
DECLARE 
TYPE EmpCurTyp  IS REF CURSOR;
c2  EmpCurTyp;
var1 varchar2(30);
drop_tab varchar2(100):='Select * from duall union all Select cast(2 as varchar(5)) from duall';
clob_out clob;
BEGIN
/*
open c2 for select * from duall;
LOOP
    FETCH c2 INTO var1;
    EXIT WHEN c2%NOTFOUND;
    dbms_output.put_line(var1);
    close c2;
  END LOOP;
  */
  open c2 for drop_tab;
  loop
  fetch c2 into var1;
   EXIT WHEN c2%NOTFOUND;
   clob_out:=clob_out||var1;
   end loop;
   close c2;
   dbms_output.put_line(clob_out);
  EXCEPTION
WHEN OTHERS THEN 
IF (SQLCODE = -942) THEN
DBMS_OUTPUT.PUT_LINE('Table Not Found');
clob_out:='~'||null;
dbms_output.put_line(clob_out);
END IF;
END;

Setting Different Java Version for Eclipse

0
  • Open eclipse config file eclipse.ini in your Eclipse folder.
  • Add -vm yourPath\Java\jre6\bin\javaw.exe like:

    -startup  plugins/org.eclipse.equinox.launcher_1.3.0.v20120522-1813.jar  --launcher.library  plugins/org.eclipse.equinox.launcher.win32.win32.x86_64_1.1.200.v20120522-1813  -product  org.eclipse.epp.package.java.product  --launcher.defaultAction  openFile  --launcher.XXMaxPermSize  256M  -vm   C:\Java\jre6\bin\javaw.exe  ...

NT Authentication In Talend

0
The solution is to place ntlmauth.dll in windows/system32 (or sysWow64) folder. but instead of placing it in system32 folder, we can place it in our own local folder and loaded by our talend jobs while establishing a connection to MS SQL Server.


Design time configuration:

1. Copy ntlmauth.dll to bin folder inside talend installed location. bin folder wont be present, create your own.

2. While executing the job, go to Advanced settings and select "Use Specific JVM arguments" and add the following additional argument

-Djava.library.path=./bin

3. Execute the talend job, it will use windows authentication

Inline image 1


Runtime configuration

1. While creating the package, ensure "bin" folder is also bundled along with ntlmauth.dll
2. Edit the .bat or .sh file and add the following entry

-Djava.library.path=./bin


Inline image 2

3. Execute the batch file, the job should run without any SSO error.

Talend Console logs to file

0

Redirecting Talend Console logs to file

Talend Open studio for data integration does not have any component to redirect console trace to a file. This blog post provides the solution for Redirecting Talend Console logs to file and is very easy to follow and extend. In previous article logging-using-talend we have seen how to customize Talend logs. This article shows how to redirect Talend console logs to file. In talend exchange, there is a custom component to perform similar job.

Solution



Redirecting Talend console logs is required by almost all Talend projects. This is very easy to implement in talend. Use tJava component at the start of any job and enter following lines into it.

Redirecting Talend Console logs

Redirecting Talend Console logs

java.io.File folder = new java.io.File("_logs/");
if (!folder.exists()) {
if (folder.mkdir()) {
System.out.println("Directory is created!");

}
else {
System.out.println("Failed to create directory!");
}
}
java.io.File logFile = new java.io.File("_log/"+projectName+"_.log");
java.io.PrintStream ts = new java.io.PrintStream(new java.io.FileOutputStream(logFile));
System.setOut(ts);
System.setErr(ts);

After executing the job, above code creates a flat file with "projectName.log" and starts writing data which appears on the console. You can create a new file at each execution by post/pre fixing with the time stamp in the file name or use following code. Attached image shows

java.io.File file = new java.io.File(projectName+"_"+TalendDate.getDate("CCYYMMDDhhmmss")+".log");

projectName is the variable name provided by the Talend and you can get the postfix date stamp by another variable TalendDate.getDate("CCYY-MM-DD hh:mm:ss"). Remove all the special characters in the time stamp. ProjectName can also be replaced by another global variable jobName.

After successfully executing this job, you can see ProjectName_TimeStamp.log  file is created in the root folder of Talend. This file contains complete console stack trace.

Drawback



 Drawback of implementing this strategy is that, it will not print or display anything on the console while using IDE or independent execution. This becomes difficult when developer is developing or debugging the job. Because, it is an extra effort for a developer to open a file every time to find out whether the job is executed successfully and generated error or not.

Talend Logging

0
java.io.File file = new java.io.File("C:/temp/myTalendLog.txt");
java.io.PrintStream ps = new java.io.PrintStream(new java.io.FileOutputStream(file));
System.setOut(ps);