Greenhouse Canada

Questions in greenhouse data

Whether electronic or manual, every greenhouse collects data. But what’s worth keeping and how does one make full use of it?

August 11, 2020
Q&A with Pieter Kwakernaak, Hoogendoorn America


As the high-tech world of greenhouse operations continues to move forward, growers are faced with a growing number of options for sensors used to monitor and streamline their operations. With greater opportunities for data collection and analysis than ever before, how can growers figure out which data is important and what to do with it all? Greenhouse Canada sat down with Pieter Kwakernaak, general manager of Hoogendoorn America, to get his take on greenhouse data and its role in climate control and optimal plant growth.

Why has greenhouse data become so essential in greenhouse climate control?
Greenhouse climate data has always been very important. However, not everyone is looking at all the data because it can be overwhelming. Sometimes it is hard to tell what exactly is important. That is why this information has to be collected, analyzed and displayed in a way that lets growers see exactly what is happening at the crop level. After all, the greenhouse climate is created to let plants grow as well as they can.

What type of data should growers collect? Why do they need such a wide array?
In general, growers should look at data that directly supports plant growth (sensor-generated data, crop measurements, etc.). In a modern commercial greenhouse, many different sensors generate data. For instance, aspirator boxes and sensors measure temperature, humidity, CO2 and PAR light, while meteo sensors measure the weather conditions outside. Additionally, there may be cameras and weighing scales. The current generation of climate computers also offers large amounts of data about actuators such as ventilation windows, screens and irrigation valves. Plant sensors and crop measurement systems provide data about crop development, yield, and quality.


All sensors in greenhouses are installed for a reason, and it is important to look at the greenhouse climate conditions. However, controlling the climate with the goal of supporting plant health is a better direction than simply trying to control indoor conditions in general, such as with setpoints for temperature or humidity.

A good starting point would be Growing by Plant Empowerment, which is a way of sustainable growing in greenhouses by supporting plant balances in water, energy and assimilates. Continuously collecting data during the cultivation process provides insights into how the greenhouse and plants function. By applying the new insights and principles of Plant Empowerment, many growers are already benefitting from improved results. They observe better growth and production and fewer pests and diseases, while saving water and energy at the same time.

With so many different types of data being collected, how can a grower make sense of it all? How can the analysis be used to improve production?
Data analysis starts with the process of data collection, data storage, and data cleaning. A high-quality dataset creates many possibilities for data analysis, such as answering the questions ‘What happened?’ and ‘Why did it happen?’ Visualizing the data can be essential in finding the correct answers. When combining data from multiple sources, data analysis tools such as dashboards or graphs are essential to present results in a clear way and gain insights into your cultivation data. Also, calculations on real-time data can be performed to deliver even more insight. This way, data is turned into meaningful information.
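As an illustration of a calculation on real-time data, the sketch below derives vapour pressure deficit (VPD) from air temperature and relative humidity, two of the measurements mentioned earlier. VPD is chosen here only as a familiar example of a derived value; the interview does not name it. The Tetens approximation is used for saturation vapour pressure, and the readings, function names and variable names are hypothetical.

```python
import math


def saturation_vapour_pressure_kpa(temp_c: float) -> float:
    """Saturation vapour pressure in kPa (Tetens approximation)."""
    return 0.6108 * math.exp(17.27 * temp_c / (temp_c + 237.3))


def vapour_pressure_deficit_kpa(temp_c: float, rh_percent: float) -> float:
    """VPD in kPa from air temperature (deg C) and relative humidity (%)."""
    if not 0.0 <= rh_percent <= 100.0:
        raise ValueError("relative humidity must be between 0 and 100 %")
    return saturation_vapour_pressure_kpa(temp_c) * (1.0 - rh_percent / 100.0)


# Hypothetical readings: (time, air temperature in deg C, relative humidity in %)
readings = [
    ("08:00", 18.0, 85.0),
    ("12:00", 26.0, 60.0),
    ("16:00", 28.0, 55.0),
]

# One derived value per reading, ready to plot next to the raw measurements.
vpd_by_time = {
    time: vapour_pressure_deficit_kpa(temp, rh) for time, temp, rh in readings
}

for time, vpd in vpd_by_time.items():
    print(f"{time}  VPD = {vpd:.2f} kPa")
```

Plotting a derived value like this next to the raw temperature and humidity lines is one way a dashboard can turn two sensor streams into a single figure that is easier to interpret.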

In order to drive plant growth, the plant has to be in balance. To achieve this, we have to look at the three plant balances:

  • Assimilates balance is directly related to photosynthesis, the starting point of plant growth. This process is described chemically as 6 CO2 + 6 H2O + light → C6H12O6 + 6 O2. Known as the production of assimilates, this is needed to build plant load.
  • Energy balance is the balance between the energy flow towards (input) and from (output) the plant. Energy input can be from radiation (by sunlight and lamps), radiant heat (pipe rail, heaters), convective energy from air movement and evaporative energy.
  • Water balance is the balance between the input of water to the plants and the output of water from them: the uptake of irrigation water on one side, and the loss of water through evaporation and its use in growth on the other.
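The water balance described above can be sketched as simple daily bookkeeping. This is a deliberately simplified illustration, not the Plant Empowerment method itself: it assumes uptake can be approximated as irrigation minus drain (the drain term is an assumption not mentioned in the interview), and all field names and figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DailyWaterRecord:
    """Hypothetical daily totals, all in litres per square metre."""
    irrigation: float   # water given to the crop
    drain: float        # water that ran off unused (assumed measurement)
    evaporation: float  # water lost by the plant through evaporation
    growth: float       # water retained in new plant material


def water_uptake(record: DailyWaterRecord) -> float:
    """Approximate uptake as irrigation minus drain (simplifying assumption)."""
    return record.irrigation - record.drain


def water_balance(record: DailyWaterRecord) -> float:
    """Input minus output; a value near zero suggests the crop is in balance."""
    return water_uptake(record) - (record.evaporation + record.growth)


# Hypothetical figures for one day.
today = DailyWaterRecord(irrigation=5.0, drain=1.5, evaporation=3.0, growth=0.3)

print(f"uptake:  {water_uptake(today):.2f} l/m2")
print(f"balance: {water_balance(today):+.2f} l/m2")
```

A persistent positive or negative value over several days would be the signal worth investigating; what counts as acceptable depends on the crop and the measurement accuracy.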

Collecting the right data can give information about whether your crop is in balance. A balanced crop makes more efficient use of water, fertilizers and energy. That is how Growing by Plant Empowerment contributes to sustainable growing. In addition to creating the optimum growth climate, resulting in optimized yields and quality, you can also save costs.

Modern greenhouses now house a multitude of data-generating sources. Photo credit: Hoogendoorn

For parts of Canada, the greenhouse can face bouts of warm and humid conditions in the summer. How could a grower use their historical data, combined with an AI algorithm, to help them optimize their greenhouse climate for the good of the crop?
The behaviour of a crop is determined not only by the actual greenhouse conditions, but also by many other factors resulting, amongst others, from the crop’s history. To describe these complex and dynamic systems, we need the help of dynamic algorithms and models based on AI techniques. In general, artificial intelligence tries to mimic human intelligence. During the training process, a neural network gradually learns from historical data how the real crop behaves, and eventually, the network is capable of simulating this behaviour when it is fed with new input data. In other words, the trained network has become a model of the real crop.

Looking at plant production systems, we can think of different models that describe growth and development, predict yield and quality, calculate expected energy and water demand, and estimate the risks of pests and diseases. Together, these models form a decision support system that is capable of answering the questions that growers like to ask, such as: What will happen if I maintain my current strategy? What if I change this specific parameter? And what is likely to happen this upcoming week based on the current weather forecast? A yield prediction model, for example, can provide useful information to the grower on the number of kilos that they are likely to produce in the next couple of weeks. This is very important for concluding profitable contracts with customers. An energy demand model can help plan the purchase of gas and electricity more accurately, taking the weather forecast into account.

An AI model needs time to learn from historical data. However, by analyzing the data and making adjustments based on keeping the plant in balance, a grower can benefit from data analysis right away.

Is there such a thing as bad quality data?
Absolutely. The data collected should be correct since growers are making decisions based on this data. Data quality needs to be of a sufficient level, because of the rule ‘garbage in = garbage out.’

It is important to make sure crop measurements are consistent and that sensors are checked and calibrated on a regular basis. Follow the recommended maintenance schedule from suppliers.

Before a dataset can be used for analysis, the data must be cleaned. The idea of data cleaning is simple: it is the process of removing outliers. Even a calibrated sensor can break down and cause errors in the dataset that should not be taken into account when calculating descriptive statistics, such as the mean value. Data cleaning is required to improve the quality of the dataset. This can be done with all kinds of tools, scripts or algorithms. It’s important to think about which data is needed for the analysis.
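To make the effect of a faulty sensor on the mean concrete, here is a minimal sketch of one possible cleaning script. It uses a modified z-score based on the median absolute deviation (MAD), which is just one common outlier test and not a method named in the interview. The temperature values, the function name and the threshold of 3.5 are all assumptions for illustration.

```python
from statistics import mean, median


def remove_outliers(values, threshold=3.5):
    """Drop values whose MAD-based modified z-score exceeds the threshold."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No spread to judge outliers against, so keep everything.
        return list(values)
    return [v for v in values if 0.6745 * abs(v - med) / mad <= threshold]


# Hypothetical air temperatures in deg C; -40.0 and 0.0 mimic sensor faults.
raw_temps = [21.4, 21.6, 21.5, -40.0, 21.7, 21.8, 0.0, 21.6]

clean_temps = remove_outliers(raw_temps)

print(f"mean of raw data:     {mean(raw_temps):.2f}")
print(f"mean of cleaned data: {mean(clean_temps):.2f}")
print(f"readings removed:     {len(raw_temps) - len(clean_temps)}")
```

The median is used instead of the mean for the test itself because the faulty readings would otherwise distort the very statistic used to detect them. A fixed threshold will not catch every fault, for example a sensor that drifts slowly, so regular calibration remains necessary.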