Manage your projects with version control
You can use version control to manage development of a data project, and to keep track of changes.
When working with version control, you can commit versions of the project during design. This allows you to revert changes, and to see the changes between two versions of the project.
You can also develop your data projects using a branching strategy. This lets you work on an isolated version of the project in each work area, or branch. The work area can be shared by several users. You can then merge your changes from the work area to a main branch for deployment to production.
GitHub is used as the provider for version control.
Getting started
-
Create a user in GitHub that the tenant can use to access GitHub. An admin may already have created this user for you.
The user must have the following scopes:
-
repo
-
read:org
-
read:user
-
read:project
-
You need write access to the repositories that you plan to change.
-
You must create a GitHub personal access token (classic). Fine-grained personal access tokens are not supported.
For more information, see GitHub documentation: Managing your personal access tokens.
-
An organization is mandatory in the GitHub configuration.
-
You need the Can edit role in the space where the project resides to perform version control actions.
-
Before you can start using version control, you must set up a configuration to connect to GitHub with the GitHub user that you created.
-
When you have set up a connection to GitHub, you can connect a project to a repository.
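Before entering the token in the GitHub configuration, it can help to confirm that it carries the scopes listed above. The following is a minimal sketch, not part of the product: it relies on the `X-OAuth-Scopes` response header that the GitHub REST API returns for classic personal access tokens. The function names and the `IMPLIED_BY` table are illustrative assumptions; the table reflects the understanding that a broader scope such as `user` also covers `read:user`.

```python
import urllib.request

# Scopes required by the list above.
REQUIRED_SCOPES = {"repo", "read:org", "read:user", "read:project"}

# Assumption: broader classic scopes that also cover a required scope.
IMPLIED_BY = {
    "read:org": {"write:org", "admin:org"},
    "read:user": {"user"},
    "read:project": {"project"},
}


def missing_scopes(granted_header: str) -> list[str]:
    """Return the required scopes not covered by an X-OAuth-Scopes header value."""
    granted = {s.strip() for s in granted_header.split(",") if s.strip()}
    missing = []
    for scope in sorted(REQUIRED_SCOPES):
        covered = scope in granted or bool(IMPLIED_BY.get(scope, set()) & granted)
        if not covered:
            missing.append(scope)
    return missing


def fetch_granted_scopes(token: str) -> str:
    """Call the GitHub REST API and return the raw X-OAuth-Scopes header."""
    request = urllib.request.Request(
        "https://api.github.com/user",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )
    with urllib.request.urlopen(request) as response:
        return response.headers.get("X-OAuth-Scopes", "")
```

Calling `missing_scopes(fetch_granted_scopes(token))` returns an empty list when the token covers every required scope. Fine-grained tokens do not report classic scopes in this header, so an empty header usually means the wrong token type was created.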
Setting up configuration to GitHub
All users who want to work with version control must set up a configuration to connect to GitHub using a GitHub user account.
You can configure GitHub in Projects. Make sure that you have prepared according to Getting started.
-
Click and then GitHub configuration.
-
Set up authentication using your organization and the GitHub personal access token described in Getting started.
-
Click OK.
You can now connect your projects to a repository.
Connecting a project to a repository
You must connect a project with a repository before you can start using version control. Make sure that you have set up a connection to GitHub.
-
In Projects, click ... on a project and select Connect to version control.
-
Select which repository to associate the project with.
-
Add a base directory path.
If you want to connect to an existing project in GitHub, you must use the same path.
-
Optionally, select to commit the project and push it to the remote repository after connecting. If you do, enter a commit message.
If you do not commit and push, a main branch will be created in the work area, but not in the remote repository.
-
Click Connect.
The project is now connected to the selected repository. This is indicated at the bottom of the project card with an icon, the name of the repository, and the current branch.
When you open the project, the title row now contains a GitHub menu with version control options. The name of the current branch is also appended to the project name.
Developing a project with version control
You can use version control with different approaches:
-
Working directly on the main branch. This is mainly suitable for a single developer on a project who wants to keep track of changes, but can also be used by a group of developers that work in sync.
-
Working with a branching strategy, where multiple developers can contribute. You can also create branches to isolate new features or changes from each other.
Simplified workflow for a single-developer project
You can work directly on the main branch of a project. This approach is simpler and involves fewer operations, but still lets you keep track of changes. If there is more than one developer, they must take care to stay in sync.
When you have made changes to the project that you want to apply, perform Commit and push.
Workflow for a multi-developer project
This workflow can be used if there is more than one developer working on a project, or if you want to isolate changes. It involves creating a development branch that you can share with other users. With this workflow, the developers can keep track of each other's changes, and decide when to merge the changes to the main branch.
-
Create a new branch
Create a new development branch from the main branch. You can share the branch with more users.
-
Develop
Make all required changes in the project.
Information note: Database schemas and connections are not maintained in version control.
-
Apply remote changes
Apply remote changes from another branch to your work area to make sure that you are up to date with changes from the other branch. This is helpful to avoid or mitigate conflicting changes.
If you have two branches containing changes that may conflict, a workaround is to:
-
Commit changes to both work areas.
-
Merge both branches to main.
-
Apply remote changes again.
-
Commit and push
Commit and push your changes to the development branch. All objects will be pushed, so it is a good idea to validate your project before committing it.
-
Open a pull request and merge
When development is complete, merge the changes from the work area to the main branch. Merging a development branch to the main branch must be performed in GitHub by opening a pull request. You can set up approval for the branch to be merged to the main branch. For more information, see GitHub Pull requests documentation.
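The pull request in the last step is normally opened in the GitHub web interface. If you prefer to script it, the following sketch builds the equivalent call to the GitHub REST API endpoint `POST /repos/{owner}/{repo}/pulls`. The function name, the default title, and the organization, repository, and branch names in the usage note are hypothetical; the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_pull_request(owner, repo, head, token, base="main", title=None):
    """Build (but do not send) a request that opens a pull request from head into base."""
    payload = {
        "title": title or f"Merge {head} into {base}",
        "head": head,  # the development branch that was committed and pushed
        "base": base,  # the branch that receives the changes
    }
    return urllib.request.Request(
        f"https://api.github.com/repos/{owner}/{repo}/pulls",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
            "Content-Type": "application/json",
        },
    )
```

For example, `urllib.request.urlopen(build_pull_request("my-org", "my-repo", "dev-branch", token))` would submit the request. Any approval rules configured for the main branch still apply to a pull request opened this way.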
Creating a new branch
-
In Projects, click ... on a project and select Create new branch.
The project must be connected to version control.
-
Select to create a branch from the main branch.
-
Enter a name for the branch.
-
Set a prefix to be added to all schemas in the project in Branch prefix for all schemas. This allows all schemas to be uniquely named to avoid naming conflicts.
-
Click Create.
A new branch is created from main, and checked out from the repository. The branch contains a new version of the project, with all tasks in status New. That is, they are not prepared and do not contain any data.
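To illustrate why a branch prefix avoids naming conflicts, the sketch below prepends a prefix to each schema name and checks that none of the resulting names collide with the main version. How the product actually combines the prefix with the schema name (separator, casing) is not stated here, so plain concatenation is an assumption, and the schema names and the `dev_` prefix are hypothetical.

```python
def prefixed_schemas(schemas, branch_prefix):
    """Map each schema name to its branch-specific name (assumed: prefix + name)."""
    return {name: f"{branch_prefix}{name}" for name in schemas}


def has_conflicts(main_schemas, branch_schemas):
    """Return True if any branch schema name is also used by the main version."""
    return bool(set(main_schemas) & set(branch_schemas))


main_schemas = ["landing", "storage"]
branch_schemas = prefixed_schemas(main_schemas, "dev_")
```

With an empty prefix, `has_conflicts` reports a collision for every schema, which is the situation the branch prefix is meant to prevent.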
Applying remote changes
You can apply changes from the remote repository to your work area. These can be changes made outside Qlik Cloud integrated version control, for example in GitHub or by other tools. Applying them helps you avoid conflicts when you commit and push your changes to the branch.
-
In Projects, click ... on a project and select Apply remote changes.
The Apply remote changes to work area dialog is displayed if changes are found.
-
You can now select which tasks to apply changes for and inspect the changes. For each change, you can select which version to use, the remote version or the version in your work area.
-
Click Apply remote changes.
You must add source connections and target connections if the changes include new data tasks.
Committing and pushing
You can commit and push your changes to the branch. Remote changes that have not been applied to your work area can be overwritten by the push, so you should perform Apply remote changes before committing and pushing.
-
In Projects, click ... on a project and select Commit and push.
The Commit and push dialog is displayed if changes are found.
-
Add a commit message that can help you keep track of what has changed.
-
Click Commit and push.
Deleting a branch
You can delete a branch when you have merged your changes to the main branch.
-
In Projects, click ... on the branch to delete and select Delete branch.
You can select to delete the remote branch in version control as well.
If there are uncommitted changes, you must confirm that these changes will be lost when the branch is deleted.
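Before deleting the remote branch as well, you may want to confirm that it has been merged. One possible check, sketched below, uses the GitHub REST API endpoint `GET /repos/{owner}/{repo}/compare/{base}...{head}`, whose response includes an `ahead_by` count of commits on the head branch that are not reachable from the base. The function names and the names in the test values are hypothetical. Note that after a squash or rebase merge the branch can still report commits ahead, so a non-zero count is a prompt to look closer rather than proof of unmerged work.

```python
def compare_url(owner, repo, base, head):
    """Build the compare URL for two branches (simple branch names assumed)."""
    return f"https://api.github.com/repos/{owner}/{repo}/compare/{base}...{head}"


def is_safe_to_delete(compare_response):
    """Return True when the head branch has no commits missing from the base.

    compare_response is the parsed JSON body of the compare call.
    A missing ahead_by field is treated as "not safe".
    """
    return compare_response.get("ahead_by", 1) == 0
```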
Removing version control for a project
You can disconnect your project from version control. If there are existing branches, they must be deleted before you can disconnect the project.
-
In Projects, click ... on the project to disconnect and select Disconnect from GitHub.
Sharing a project with other spaces or tenants
You can share a version of a project with a different space on the same tenant, or on another tenant. This is useful when you want to create two environments, for example one for development, and one for production.
-
Create a new project with the same name as the original project.
Set the same use case and platform type. You can use different connections.
-
Connect the new project to the same repository and base directory path as the original project.
Information note: Do not select the Commit and push option.
-
If the platform connection points to the same target as the original project, make sure you change the database or schemas of the data tasks. One way to change the schemas of all tasks is to change Prefix for all schemas in Metadata project settings before applying remote changes. This ensures that all tasks are created with this prefix.
-
Apply remote changes, selecting all files.
-
Add missing source connections for all landing tasks.
Click Select source data in the landing task, select the connection, and click Save.
Information note: The connection must be to the same type of source and point to tables with the same names as in the original project.
-
Add missing source and target connections for all replication tasks.
This creates another work area. You can make changes to the project in one of the work areas, and use Apply remote changes to sync the other work area.
Security considerations
Make sure that you maintain synchronized security configurations between Qlik Talend Data Integration and GitHub.
-
In Qlik Talend Data Integration, permissions are based on spaces that may contain several projects. In GitHub, permissions are based on repositories that may also contain several projects. Best practice is to align them and connect all projects in one space to the same repository.
-
Keep in mind that Qlik Talend Data Integration and GitHub use different permissions and roles for users.
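The alignment recommended above can be audited with a simple check. The sketch below assumes you keep your own list of (space, project, repository) connections; it does not read anything from Qlik Talend Data Integration or GitHub, and every name in it is hypothetical. It reports the spaces whose projects are connected to more than one repository.

```python
from collections import defaultdict


def spaces_with_mixed_repositories(project_connections):
    """Return {space: [repositories]} for spaces connected to more than one repository.

    project_connections is an iterable of (space, project, repository) tuples.
    """
    repos_by_space = defaultdict(set)
    for space, _project, repository in project_connections:
        repos_by_space[space].add(repository)
    return {
        space: sorted(repos)
        for space, repos in repos_by_space.items()
        if len(repos) > 1
    }


connections = [
    ("Sales space", "orders", "my-org/sales-data"),
    ("Sales space", "invoices", "my-org/sales-data"),
    ("Finance space", "ledger", "my-org/finance-data"),
    ("Finance space", "budget", "my-org/shared-data"),
]
misaligned = spaces_with_mixed_repositories(connections)
```

An empty result means every space maps to a single repository, so permissions only need to be kept in sync between one space and one repository at a time.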
Best practices
Here are some best practices when working with projects using version control.
-
Add a README file that describes the repository in GitHub. For more information, see About READMEs.
-
Commit only projects that are valid and tested to run.
If you add projects with Landing or Registered data tasks that have not been prepared or transformed, the source columns are not yet included. Source columns are added when the task is prepared and transformed.
-
When you create a branch for a replication project, be aware that the branches use the same target by default. This means that running tasks in the branch can overwrite data of the main version. To avoid data loss, change the target settings in the branch so that they do not conflict with the main version.
-
When you create a branch, there may be remote changes that have not yet been applied to the work area. Apply remote changes either before or after creating the branch, unless you want to discard the remote changes.
Limitations
-
It is not possible to disconnect or delete a project using version control if there are branches. The branches must be deleted before you can disconnect or delete the project.
-
It is not possible to rename a project using version control.
-
When you delete a tenant, objects stored with version control in GitHub will not be deleted. You need to delete these objects manually.
-
If a repository is used by many projects, or contains many files that are not stored in Qlik Talend Data Integration, performance may degrade.
-
Database schemas and connections are not maintained in version control.
-
GitHub fine-grained personal access tokens are not supported.
-
A branch (except the main branch) can be used by only one project per repository, even if two projects are in the same repository.
-
It is not possible to apply remote changes from a project that uses a different target platform. For example, if you commit changes to a project on Snowflake, you cannot apply the changes to a Databricks project.