Databases


Handling large volumes of data across multiple sites means storing it on database servers. A well-structured database ensures that the data can be stored, accessed, transferred and retrieved throughout production across different (global) sites.

Many of the assets and shots stored can be accessed from, and published into, an internal database system. The database follows a stem-and-leaf structure, which allows artists to navigate to and find the specific data they query. The system can locate and retrieve data for specific shows and shots within the stem, along with the submissions and tasks performed by artists inside it. Versions and WIP (work-in-progress) playblasts of the files can be found inside the set tasks. Each task is either approved, made available or declined, which ensures that the files passed down the pipeline meet a standard of quality.


DATA STRUCTURE CLASSIFICATION

  1. STRUCTURED DATA
    • data that is stored in an ordered manner through a relational database management system.
  2. UNSTRUCTURED DATA
    • data that has no clear format in storage and which, in the past, there was no real way of analysing.

We have a large internal asset management system (AMS) at DNEG that stores publishes from artists working on a particular asset or shot within a show. The AMS has a standardised, hierarchical way of registering files and is accessible throughout the pipeline. Splitting the database by show helps with load balancing and means shows can be upgraded or locked down independently of each other, which is critical during delivery periods. The database is maintained on a weekly basis, during which it is reset and updated.

Shotgun is a third-party web-based application used to track data for a show and review asset/shot work. The AMS and Shotgun database systems communicate and sync information, which allows users to access the same data from both applications. When production coordinators link you to a show, you are granted access to the show folder in the AMS and to the show’s Shotgun page. This protects client work and ensures that only people working on the show are given permission.

I had to make a few site transfers to pick up dependency rig files from Mumbai that needed to be synced. The files were migrated across to the London site in a matter of seconds.

RELATIONAL DATABASE

A relational database consists of a series of tables of structured data. Rows (records) in one table refer to primary keys (unique-per-row values) in other tables, which enables a level of complexity and efficiency that might not be possible in a traditional two-dimensional (single-table) database.

Different tables are created for different object types. In the context of VFX work, for example, there might be tables for shots, assets and asset versions, where the asset table might include a number of different items that are being worked on for a specific shot. Each asset would hold a reference to the shot it belongs to, based on the primary key from the shot table.
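The shot/asset/asset-version relationship described above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table names, column names and sample values are assumptions made for the example and do not reflect the actual AMS or Shotgun schema.

```python
import sqlite3

# In-memory database so the sketch leaves nothing on disk.
conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical schema: one table per object type, linked by primary keys.
conn.executescript("""
CREATE TABLE shot (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE asset (
    id      INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    shot_id INTEGER NOT NULL REFERENCES shot(id)
);
CREATE TABLE asset_version (
    id       INTEGER PRIMARY KEY,
    asset_id INTEGER NOT NULL REFERENCES asset(id),
    version  INTEGER NOT NULL,
    status   TEXT NOT NULL
);
""")

# Each child row stores the primary key of its parent row.
shot_id = conn.execute(
    "INSERT INTO shot (name) VALUES (?)", ("SHOT_010",)
).lastrowid
asset_id = conn.execute(
    "INSERT INTO asset (name, shot_id) VALUES (?, ?)", ("hero_rig", shot_id)
).lastrowid
conn.execute(
    "INSERT INTO asset_version (asset_id, version, status) VALUES (?, ?, ?)",
    (asset_id, 1, "approved"),
)


def try_orphan_asset(connection):
    """Try to add an asset that points at a shot id that does not exist."""
    try:
        connection.execute(
            "INSERT INTO asset (name, shot_id) VALUES (?, ?)", ("orphan_prop", 999)
        )
    except sqlite3.IntegrityError:
        return False  # rejected by the foreign key constraint
    return True


orphan_accepted = try_orphan_asset(conn)
```

Because the asset table references the shot table's primary key, the database itself refuses an asset that belongs to no known shot, so `orphan_accepted` ends up `False`.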

To QC (quality check) specific assets I needed to create a breakdown of the shot context, linking different assets onto the asset stack so that, when the shot sequence scene file is loaded in Maya, each link updates to the latest non-declined version of the asset. Various data sets from shows are manipulated and linked by production assistants/coordinators. They are assigned to specific shots, which are imported into Shotgun.

DATABASE QUERIES

To request data from a database you need to run a database query. Queries can perform many functions, such as retrieving specific information from the tables. Data might be spread across a number of different tables; queries give you the ability to view it in a single spreadsheet-like result.
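As a rough illustration of pulling data from several tables into one flat result, the sketch below joins two tables with Python's built-in sqlite3 module. The tables, columns and rows are hypothetical and exist only for this example.

```python
import sqlite3

db = sqlite3.connect(":memory:")

# Hypothetical tables and sample rows, created only for this sketch.
db.executescript("""
CREATE TABLE shot  (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE asset (id INTEGER PRIMARY KEY, name TEXT,
                    shot_id INTEGER REFERENCES shot(id));
INSERT INTO shot  VALUES (1, 'SHOT_010'), (2, 'SHOT_020');
INSERT INTO asset VALUES (1, 'hero_rig', 1),
                         (2, 'set_extension', 1),
                         (3, 'prop_car', 2);
""")

# The JOIN combines both tables into one spreadsheet-like set of rows;
# the WHERE clause narrows the result to a single shot.
rows = db.execute(
    """
    SELECT shot.name, asset.name
    FROM asset
    JOIN shot ON shot.id = asset.shot_id
    WHERE shot.name = ?
    ORDER BY asset.name
    """,
    ("SHOT_010",),
).fetchall()

for shot_name, asset_name in rows:
    print(shot_name, asset_name)
```

Passing the shot name as a `?` parameter rather than formatting it into the SQL string keeps the query safe from malformed or malicious input.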

Shotgun, the project management tool used at DNEG, provides a number of functions that enable developers and TDs to construct queries when interacting with the data through the Shotgun API. Deleting and editing data this way is safer because you do not need to interact directly with the MySQL and PostgreSQL servers.

Below is an example of a Shotgun query that returns a list of dictionaries containing the name and sg_status attributes of all projects whose sg_status attribute is set to “Active”:

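# Internal connection wrapper around the Shotgun API
from shotgun.common import conn

# Find every Project entity whose sg_status field is "Active"
active_shows = conn.shn.find(
    "Project",
    [["sg_status", "is", "Active"]],
    fields=["name", "sg_status"],
)

# Example result:
# [
#    {
#       'sg_status': 'Active',
#       'type': 'Project',
#       'id': 2,
#       'name': 'SHOW_1'
#    },
#    {
#       'sg_status': 'Active',
#       'type': 'Project',
#       'id': 54,
#       'name': 'SHOW_2'
#    },
#    # etc...
# ]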

One of the tasks I had to assist with was checking whether the models within the rig sessions available online were up to date. I ran a query that goes through each show directory containing a rig task object, looks for the latest version and makes sure that it is not declined.
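The "latest version that is not declined" check can be sketched in plain Python. This is a minimal illustration, not the query that was actually run: the dictionary keys `version` and `status`, and the sample records, are assumptions standing in for whatever the database returns.

```python
def latest_non_declined(versions):
    """Return the highest-numbered version whose status is not 'declined'.

    Returns None when every version has been declined (or the list is empty).
    """
    usable = [v for v in versions if v["status"] != "declined"]
    return max(usable, key=lambda v: v["version"], default=None)


# Hypothetical records for one rig task.
rig_versions = [
    {"version": 1, "status": "approved"},
    {"version": 2, "status": "available"},
    {"version": 3, "status": "declined"},
]

latest = latest_non_declined(rig_versions)
print(latest)
```

Here version 3 is the newest but has been declined, so the function falls back to version 2.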

PRESERVING DATA QUALITY

The VFX workflow requires artists to manipulate the data to some extent when adding animated characters and set extensions, which are lit and composited so that they integrate seamlessly with the client’s footage. This means the quality of the source material must be retained.

Validation

It is vital that data entered into the database is validated. Validation ensures that data of the correct type and format is entered so that it can be sorted and fits the correct context. For example, a date of birth could be stored in a variety of ways. Having a consistent format makes it easier to order and query data.
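A minimal sketch of format validation for the date-of-birth example, assuming the agreed format is YYYY-MM-DD. That format choice and the function name `validate_date` are assumptions made for this illustration.

```python
import re
from datetime import date, datetime

# Exactly four digits, two digits, two digits, separated by hyphens.
ISO_DATE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")


def validate_date(text):
    """Return a date if text is a real calendar date written as YYYY-MM-DD.

    Raises ValueError for any other format or for impossible dates.
    """
    if not ISO_DATE.fullmatch(text):
        raise ValueError(f"expected YYYY-MM-DD, got {text!r}")
    # strptime rejects impossible dates such as 30 February.
    return datetime.strptime(text, "%Y-%m-%d").date()


birthday = validate_date("1990-04-07")
assert birthday == date(1990, 4, 7)

# With one consistent format, even the raw strings sort chronologically.
ordered = sorted(["1990-04-07", "1985-12-01", "1990-01-15"])
```

The pattern check enforces the format and the parsing step checks that the date actually exists, so a value like "07/04/1990" or "1990-02-30" is rejected before it reaches the database.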

Verification

Another important practice when entering data into a database is verification: making sure that the correct data is entered. Data can be entered manually or through an automated process, and either can introduce incorrect values. One way to catch human error is double-entry verification, which prompts the user to enter the data twice so that the two entries can be compared.
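Double-entry verification can be sketched as a comparison of the two entries. The function name and the decision to ignore surrounding whitespace are assumptions made for this example.

```python
def double_entry(first, second):
    """Return the value if both entries match, ignoring surrounding whitespace.

    Raises ValueError when the two entries differ, so the caller can ask
    the user to enter the data again.
    """
    a, b = first.strip(), second.strip()
    if a != b:
        raise ValueError("entries do not match; please re-enter")
    return a


# Both entries agree once the stray trailing space is ignored.
shot_name = double_entry("SHOT_010", "SHOT_010 ")
```

This only confirms that the same value was typed twice. It cannot tell whether the value itself is right, which is why it complements validation rather than replacing it.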