Methodological approaches to identifying and mapping fields of specific crops on the basis of high-resolution satellite images using phenological, geographic and regional statistical information

The paper describes methods that can be used to map crops of interest on the basis of qualitative information extracted from satellite images, namely image interpretation signs such as color or tone, used in conjunction with geographical, phenological and ecological data and regional statistical information. Three crops were selected as crops of interest: lavender, almonds and barley.


Introduction
The literature describes in detail methods for mapping the most common crops and estimating their yields from remote sensing data when field data for calibrating models or validating the generated maps are scarce or absent. Methods that detect and identify fields and plantings of different crops from crop-specific phenological phenomena have shown quite high efficiency. These phenomena are reflected in differences between the seasonal curves of vegetation indices (for example, NDVI or EVI) calculated from time series of medium- and high-resolution satellite data. A number of previous studies, e.g. [1][2][3], have shown that such differences are clearly observed between winter and spring crops. Information on the onset of a specific phenological phase is useful for more accurate identification of the plantings of a certain crop, especially among winter crops, and for subsequent yield assessment. To supplement it, crop calendars for many years or for the current season are widely used, as well as Growing Degree Days, an important agroclimatic indicator characterizing the limits of favorable thermal conditions for plant development during the season. The vegetation cover objects most commonly mapped in agricultural research are important food crops such as wheat, maize, sunflower and rapeseed (see, for example, Roumenina [4][5][6][7]). By the end of the second decade of the 21st century, satellite monitoring and agricultural mapping capabilities had improved significantly with the launch of the Sentinel-1 and Sentinel-2 satellites, which provide unique data in the optical and radar ranges with an average revisit interval of 3-6 days and high spatial resolution (up to 10 m). This has made it possible to accumulate significant volumes of data for analyzing the condition and assessing the yield of the most important crops.
At the same time, a number of difficulties complicate the mapping of agricultural crop fields (as well as many other types of vegetation cover) in an arbitrarily chosen territory for both past and current agronomic seasons. These are often associated with a lack of field data, or difficulty in collecting them, on the presence and growing-season characteristics of a specific crop in a certain area during a particular time interval, combined with factors such as annual crop rotation, the young age of trees and small planting size. Collecting such data remains laborious and expensive, in particular in areas with poor transport and road infrastructure. One of the most important tasks in land cover mapping is therefore to make field data collection more efficient at minimal cost, or to find alternative sources of relevant information about the crops grown in a specific region. Other difficulties relate to the remote sensing data themselves: the lack of suitable cloud-free and very high-resolution images, the lack of scientific and statistical information, and generally poor knowledge of the vegetation cover of the study area.
Besides, despite the daily growth in the volume of high-resolution remote sensing data and in information resources that provide up-to-date and detailed statistics, there is still a perceptible lack of specialized research on thematic mapping of vegetation cover and agricultural crops from satellite data in which traditional interpretation signs, carrying qualitative knowledge about the objects, are used in addition to a limited set of vegetation indices. These signs include features representing spectral information, such as color and tone, together with the shape, texture and dimensions of objects [8][9][10]. They make it possible to identify the fields of some less widespread crops visually without a field data collection stage, or help to minimize its duration.

Subjects of crop detection and mapping
Several crops used widely in the food and perfumery industries, belonging to different families (Lamiaceae, Rosaceae, Poaceae) and representing different plant life-forms were selected as test crops for mapping in order to define the main object interpretation signs and characteristics that contribute to confident identification of the places of cultivation of such crops: lavender (Lavandula angustifolia) in Bulgaria and France, almonds (Prunus dulcis) in the USA and Morocco, and winter barley (Hordeum vulgare) in Russia. These crops were chosen due to their lower popularity and total area occupied, in comparison to the world's most widespread and similar crops, as well as due to the specific appearance of plants, in particular, lavender, which should contribute to their more confident identification in areas of intensive agriculture.

Materials
The main sources of information and statistical input data for the analysis were:
1. Reports of state ministries of agriculture and commercial analytical organizations, research articles, web publications and fiction on the beginning and end of the harvest; maps containing information on the presence of the crop in specific regions, their crop specialization, and the shape and size of the fields.
2. Reports containing information on the areas occupied by each crop by administrative unit and their annual dynamics, published by statistical services of ministries and analytical organizations; thematic geographical maps, in particular maps of distribution areas and maps of the ecological suitability of territories for growing these crops, depending on the set of limiting factors.
3. Reference information: websites and addresses of organizations involved in the production and marketing of the relevant agricultural products, as well as travel brochures containing routes for the regions where these crops are grown.
4. Internet blogs, user photos and videos, which give an idea of the shape, size, color and exact location of fields and orchards in a given area and of the timing of the beginning of different phenological phases, facilitating the identification of fields in satellite images.
5. Direct search for farms, fields and plantings on Google Maps and other mapping services (Esri, Bing, Mapbox, OpenStreetMap) using published crop distribution maps, with clarification of information near settlements using the Google Street View service.
A set of features that make it possible to identify fields of a specific crop in a satellite image
All information that makes it possible to determine the presence of a specific crop in a particular area using satellite images can generally be attributed to two groups: information on the physical properties of an object (biometric: height and shape of plants, seasonal changes in their colors; shape and size of plantings) and statistical information (the area of the territory occupied by a specific plant species in this region and its dynamics; approximate harvest dates for different crops, etc.). Thus, the appearance of the plant and the color, shape and size of the fields are the main primary characteristics that are reflected in the satellite image and contribute to the visual identification of agricultural fields and their distinction from natural vegetation, while statistical data supplement this information by specifying the growing area and the periods of the season in which to search for fields.
The main visual feature that simplifies the identification of fields and plantings of a particular crop in a satellite image is color, which in phenological terms is characterized by the aspect. The aspect during the flowering period is one of the most effective indicators that can be monitored using medium- and high-resolution satellite images [11]. One efficient way to identify a crop accurately is to compare the spectral brightness values in the R, G and B bands (visible or infrared ranges) of the satellite image with those of digital photographs taken in the field. The image and the photographs should be acquired on dates close to each other, or in identical periods of different years, under the same illumination conditions (the same time of day and weather) and at the same phenological phase of the crop, which together result in similar field colors. For more accurate analysis, it is best to choose images that cover the stages of flowering, wilting, harvesting, etc.
To map the fields of a specific crop effectively, it is important to understand how geographical and ecological conditions, which determine the phenology of each crop, change from area to area. Historical, cultural and economic factors matter as well: they determine the popularity of a particular crop in different regions or countries, depending on the agricultural traditions that have developed there. These factors can manifest themselves in individual variations in the appearance and dimensions of plants, in different sizes, shapes, structures and colors of fields, and in shifts in harvest timing from region to region.
A promising approach to identifying perennial crops and distinguishing similar crops from each other is to analyze how the color of plantings differs between years of plant development, depending on age. On the one hand, satellite images that cover only the period when the plants are juvenile can make such plantings difficult to identify. On the other hand, multi-year time series of images make it possible to evaluate how the color of plantings of different crops changes from year to year, which helps to distinguish plantings of similar crops. Additionally, graphs of annual changes and peak values of vegetation indices over several seasons can be analyzed.
For example, Chen, Jin and Brown (2019) used Landsat time series data to determine the planting year and stage of development of almond trees based on the annual maximum values of the NDVI index [12].

Visual identification of lavender fields in Bulgaria
The first crop selected for mapping was lavender. To search for lavender fields using satellite imagery, several areas in Bulgaria and France were selected, as these are the world's leading countries in the cultivation of lavender for perfumery.
A telling source of information about areas of lavender cultivation is a map showing spatial differences in the distribution of fields occupied by this crop in a given territory. Using maps of the areas where lavender is cultivated in Bulgaria (Fig. 1), it is possible to find out which regions lead in lavender production and to estimate the share of lavender fields in the total area of agricultural land. It is also possible to analyze the dynamics of field areas over several years, which are reflected in the time series of satellite images through the emergence of new fields or, conversely, the replacement of lavender with other crops. Statistical information is therefore an important source of data for tracking changes in crop areas relative to other crops in space and time, and it increases the reliability of field identification. As the map shows, the leading regions of Bulgaria in lavender cultivation in 2018 were Dobrich province in the north-east and Stara Zagora province in the central part of the country. In Dobrich province the area of lavender increased by more than 1300 hectares over the year (about 30%) and made up almost half of the total area under this crop in the country. In Stara Zagora province, with 2040 hectares, the area of lavender fields instead decreased by more than 100 hectares over the year. Thus, knowing the total area occupied by lavender plantings in these areas in a particular year, one can set the task of finding all fields in the images and thereby find out how effective a particular feature is for their identification. At the same time, to search for the new lavender fields that account for the increase in total area, it is useful to determine the characteristics of young plants, such as their dimensions and shape, and how they appear in satellite images.
In this case, to reliably identify new plantings, it can be useful to examine very high-resolution satellite imagery for small parts of the area in order to find fine details, such as row structure and row spacing, that are not visible in lower-resolution imagery.
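The area-based check described in this section, in which the total area of the fields found in the images is compared with the regional statistic, can be sketched as follows. This is only a minimal illustration under stated assumptions: the function name and all numbers in the example are hypothetical and are not taken from the study.

```python
def mapping_completeness(mapped_field_areas_ha, reported_total_ha):
    """Share of the officially reported crop area that was found in the imagery."""
    if reported_total_ha <= 0:
        raise ValueError("reported_total_ha must be positive")
    return sum(mapped_field_areas_ha) / reported_total_ha


# Hypothetical example: areas (ha) of three fields digitized from imagery,
# compared with a hypothetical reported total of 50 ha for the same unit.
digitized_fields_ha = [12.5, 8.0, 20.5]
completeness = mapping_completeness(digitized_fields_ha, reported_total_ha=50.0)
print(f"{completeness:.0%} of the reported area was found")
```

A value well below 100% would suggest that some fields, for example young plantings, are not captured by the interpretation feature being tested.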

Visual identification of lavender fields in France
To identify lavender fields in France, maps of tourist routes in the lavender-growing areas of Provence were used (Fig. 5), along with user photographs (Fig. 6) and Google Street View, which were used to pinpoint the location of the fields on a high-resolution Google Earth basemap (Fig. 7).
The colors of lavender fields in Bulgaria are visually close to each other in the satellite images and in the photographs (Fig. 2, 3 and 4), which gives reason to interpret these fields as lavender with high confidence. Although the color of these fields changes from violet to purple over the lavender flowering period, on some dates many similar areas can be found. The average color intensity in the RGB color model, calculated in a 3x3 pixel cell corresponding to part of a lavender flower taken from one of the rows in the photograph, is 175, 179 and 170 for R, G and B, respectively, which produces a magenta/purple color. In the Google image, the average R, G, B values calculated for a randomly selected lavender field are 95, 73, 75, and in the Sentinel-2 image they are 100, 65, 65; that is, these values also give very close shades of magenta.
The color of lavender fields in the satellite images for France differs visibly from the color in the photograph: in the Sentinel-2 image it is darker, with a predominance of gray shades, while in the Google image it is slightly lighter than in the Sentinel-2 image (Fig. 7 and 8). The Google image clearly shows that the color of the fields is not uniform, with bare soil visible between the individual rows of lavender. In general, the prevalence of "dirtier" green-gray colors of lavender fields in the Sentinel-2 image for France, compared to the purple colors of the fields in Bulgaria, can probably be attributed to the mixing of purple and green colors during the harvest season. This may be due to local traditions of lavender cultivation, which manifest themselves in the selection of lavender varieties suited to local climatic and soil conditions, as well as to the larger average field size and possibly wider row spacing than in Bulgaria. Simple GIS-based measurements showed that the maximum area of an individual lavender field in the Dobrich province of Bulgaria does not exceed 23 hectares, while in Provence many fields have an area of more than 30 hectares.
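The comparison of mean field colors between image sources can be made explicit with a short sketch. The two RGB triplets are the Google and Sentinel-2 values reported above for the Bulgarian lavender field; the use of Euclidean distance in RGB space and the tolerance value are assumptions made for illustration only, since the comparison in this paper was visual.

```python
import math


def mean_rgb(pixels):
    """Average R, G, B over a window of pixels, e.g. the 9 pixels of a 3x3 cell."""
    n = len(pixels)
    return tuple(sum(p[i] for p in pixels) / n for i in range(3))


def rgb_distance(a, b):
    """Euclidean distance between two RGB triplets (0 means identical colors)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Mean field colors reported in the text for the same lavender field.
google_rgb = (95, 73, 75)
sentinel2_rgb = (100, 65, 65)

# Hypothetical tolerance; in practice it would have to be tuned per scene.
TOLERANCE = 30
distance = rgb_distance(google_rgb, sentinel2_rgb)
print(round(distance, 1), distance <= TOLERANCE)
```

A perceptually uniform color space would probably be a better basis for such a comparison than raw RGB, but plain RGB keeps the sketch close to the values quoted in the text.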

Visual identification of almond orchards in the USA and Morocco
Almonds were selected as the second test crop in order to understand the problems of mapping tree crop plantings. The locations of almond orchards were taken from advertisements and catalogs for the sale of orchards in the United States. The color of almond orchards during the bloom period is also visually similar in the photograph (Fig. 9) and in the satellite images (Fig. 10 and 11). However, the color of blooming mature almond trees cannot be the only feature used to identify almond orchards of all ages. Since almonds begin to bloom and bear fruit only a few years after planting, it is also necessary to search for all new plantings of young trees, alongside mature ones, using statistical information. That is, knowing the total area of almond plantings in a territory, we can try to find all plantings of both young and adult trees of all possible varieties. For example, the total area of bearing and non-bearing almonds in California in 2020 amounted to 1.6 million acres, while in the previous season (2019) it was 1.52 million acres [18]; that is, it grew by 80,000 acres. At the same time, the area of non-bearing almonds increased by more than 10,000 acres compared to 2019. Identifying young non-bearing almond plantings requires a different method based on vegetation indices, which is discussed in a separate section of this paper.
The approach described above was also applied to the visual identification of almond orchards in Morocco, one of the world's leading countries in commercial almond production [19]. Morocco was selected arbitrarily as an area for which the only publicly available information was the exact locations of several of the largest almond farms, photos of the orchards and the approximate dates of the bloom period [20]. Finding out the approximate timing of almond bloom in Morocco (from mid-February to March) made it possible to determine the area occupied by orchards near the known places of cultivation. The "striped" texture of the objects identified in the image as crop plantings (Fig. 12) also supports the conclusion that these objects are tree orchards.
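Once the approximate bloom window is known, the candidate images can be narrowed down to that window. The sketch below shows one possible way to do this; the scene list, its field names and the cloud-cover limit are hypothetical, and the window of 15 February to 31 March is only an assumed reading of "mid-February to March".

```python
from datetime import date


def scenes_in_window(scenes, start, end, max_cloud_pct):
    """Keep scenes acquired within [start, end] with acceptable cloud cover,
    ordered from the least cloudy to the most cloudy."""
    selected = [
        s for s in scenes
        if start <= s["date"] <= end and s["cloud_pct"] <= max_cloud_pct
    ]
    return sorted(selected, key=lambda s: s["cloud_pct"])


# Hypothetical scene catalogue for one area of interest.
scenes = [
    {"id": "A", "date": date(2020, 1, 28), "cloud_pct": 2.0},   # before bloom
    {"id": "B", "date": date(2020, 2, 20), "cloud_pct": 35.0},  # too cloudy
    {"id": "C", "date": date(2020, 3, 1), "cloud_pct": 5.0},
    {"id": "D", "date": date(2020, 3, 16), "cloud_pct": 0.5},
]

bloom_scenes = scenes_in_window(
    scenes, date(2020, 2, 15), date(2020, 3, 31), max_cloud_pct=20.0
)
print([s["id"] for s in bloom_scenes])
```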

Visual identification of winter barley fields
Another effective approach to identifying fields occupied by a specific crop, especially an annual crop, is to use reliable information on the start and end dates of the harvest and on the proportion of fields harvested by each date within the harvest period. This also helps to distinguish between fields of similar crops, for example wheat and barley, which often occupy the largest areas among all cereals. For example, in the largest countries of Europe, Russia and Ukraine, the main cereal crop today is wheat, whose area during the last several years was about 8-10 times larger than that of barley. The winter barley harvest usually begins 7-10 days earlier than the winter wheat harvest because barley ripens earlier: its growing season lasts 230-300 days, which is on average 6-10 days shorter than that of winter wheat. In satellite images, these differences appear as a sharp change in the color of some of the fields from moderate brown to light brown within a short period of time, which indicates the beginning of the barley harvest. Some time after the end of the harvest, all these fields turn dark brown (the color of the soil), which indicates that the fields have been cleared of straw and plowed in preparation for the new season.
As an example, we analyze the dynamics of the harvested area of winter barley during the 2018-2019 season in Krasnodar Krai, Russia. The dates of publication of articles in local online newspapers made it possible to trace quite clearly the beginning, middle and end of the harvest, as well as the harvested area on the corresponding dates. For example, on June 10, 2019, the "VK Press news" agency announced the upcoming beginning of the barley harvest [21], and on June 28, 2019, an article in "Delovaya Gazeta.Yug" announced the completion of the barley harvest on 100% of the sown area [22], that is, 130,500 ha (see table).

[Table: harvested area by date for winter barley and winter wheat]
Below are the images showing winter barley and winter wheat fields in Krasnodar Krai in the second and third decades of June, on the dates corresponding to the beginning (Fig. 13) and the end (Fig. 14) of barley harvesting.
To improve the efficiency of the considered approach, more frequent high-resolution time-series images are required, for example on a daily basis. To increase the reliability of field detection, it is also necessary to have additional data on all varieties of barley grown in a given territory and on the differences in their growing-season requirements (growing degree days, precipitation requirements, etc.).
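The date-based separation of barley and wheat described in this section can be sketched as a simple rule: find the first date on which a field becomes markedly lighter, and compare it with the reported end of the barley harvest. Everything in the example is an assumption made for illustration (the function names, the brightness series and the jump threshold); only the cutoff date of June 28, 2019 comes from the newspaper report cited above.

```python
from datetime import date


def first_brightening(series, min_jump):
    """Return the date of the first observation whose brightness exceeds the
    previous one by at least min_jump, or None if there is no such jump."""
    for (_, previous), (day, current) in zip(series, series[1:]):
        if current - previous >= min_jump:
            return day
    return None


def label_by_harvest_date(harvest_day, barley_cutoff):
    """Label a field as barley if it was harvested on or before the cutoff."""
    if harvest_day is None:
        return "unknown"
    return "winter barley" if harvest_day <= barley_cutoff else "winter wheat"


# Hypothetical per-field mean brightness (0-255) on successive image dates.
field_a = [(date(2019, 6, 8), 92), (date(2019, 6, 13), 95),
           (date(2019, 6, 18), 140), (date(2019, 6, 23), 138)]
field_b = [(date(2019, 6, 8), 90), (date(2019, 6, 18), 93),
           (date(2019, 6, 28), 96), (date(2019, 7, 3), 142)]

cutoff = date(2019, 6, 28)  # barley harvest reported as complete on this date
for name, series in (("A", field_a), ("B", field_b)):
    harvest_day = first_brightening(series, min_jump=30)
    print(name, harvest_day, label_by_harvest_date(harvest_day, cutoff))
```

The rule is deliberately crude: it treats every field harvested after the cutoff as wheat, so other cereals and late-harvested barley would be mislabeled, and sparse or cloudy time series would shift the detected dates.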

Additional steps. Auxiliary calculated vegetation indices
Often, plantings of the crop of interest need to be distinguished from more important and popular crops, or the age of a planting needs to be determined in order to distinguish a perennial crop from an annual one. In such cases single images are often not enough, and fields have to be detected using additional methods based on long time series of satellite images. Vegetation index analysis is one of the widely used approaches to distinguishing one crop from another. When comparing vegetation index curves, it is also necessary to take into account the ecological growth conditions of each crop and its individual varieties in a given territory. These conditions manifest themselves in differences in the onset dates of each phase of plant development, and hence in the start and end dates of the harvest. These data must be compared with statistical information: the percentage of the region's area occupied by the crop of interest and by the "secondary" crop in the area (e.g. winter barley vs winter wheat), and the dynamics of field areas in different years.
As an example, the average per-field NDVI values derived from Sentinel-2 images were calculated for several almond orchards, both bearing (Fig. 15) and non-bearing (Fig. 16). The first graph shows a curve with a shape characteristic of bearing almond orchards; the segments in which the index reaches its maximum and minimum values correspond to the onset of particular annual phenological phases (for example, dormancy from November to February or bloom from February to mid-March). The second graph shows that the annual maximum NDVI values in this orchard increase each year from the smallest values (0.2-0.3) to about 0.7. This means that a perennial tree crop is growing in this field and that the total biomass of the trees increases every year. Such a curve differs significantly from the NDVI curves of annual crop fields, where constant crop rotation changes the shape of the curve every year. In general, an important step in detecting fields of perennial crops is to search for areas where such changes have occurred, based on a clear change in the color of the field in the spring-summer period. Such a change can result either from the regular change of annual crops in the course of crop rotation or from complete plowing of the field followed by the planting of a perennial crop. To distinguish a field of a perennial crop from fields of similar crops, it is useful to use satellite images with a large number of spectral channels.
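The reasoning about rising annual NDVI maxima can be illustrated with a short sketch. NDVI is computed with its standard definition, (NIR - Red) / (NIR + Red); the trend test, its threshold and the sample values are hypothetical and serve only to show the idea, not to reproduce the analysis behind Fig. 15 and 16.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red reflectance
    (for Sentinel-2 these are usually bands B8 and B4)."""
    total = nir + red
    return 0.0 if total == 0 else (nir - red) / total


def annual_maxima(observations):
    """Map each year to the maximum NDVI observed in that year.
    observations: iterable of (year, ndvi_value) pairs."""
    maxima = {}
    for year, value in observations:
        maxima[year] = max(value, maxima.get(year, value))
    return dict(sorted(maxima.items()))


def looks_like_young_orchard(maxima, min_total_gain=0.3):
    """True if the annual maxima never decrease and rise by min_total_gain or more.
    The threshold is a hypothetical value, not one calibrated in the study."""
    values = list(maxima.values())
    if len(values) < 3:
        return False
    non_decreasing = all(b >= a for a, b in zip(values, values[1:]))
    return non_decreasing and values[-1] - values[0] >= min_total_gain


# Hypothetical per-field mean NDVI observations as (year, value) pairs.
young_orchard = [(2017, 0.15), (2017, 0.25), (2018, 0.40), (2018, 0.35),
                 (2019, 0.55), (2020, 0.70)]
annual_rotation = [(2017, 0.80), (2018, 0.60), (2019, 0.85)]

print(looks_like_young_orchard(annual_maxima(young_orchard)))
print(looks_like_young_orchard(annual_maxima(annual_rotation)))
```

A strictly non-decreasing test is sensitive to noisy years, so a fitted slope over the annual maxima might be a more robust choice in practice.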
For detection of a specific annual crop growth stage or a phenological phase, more specialized indices are often used that can characterize the phenology of different crops. For example, Chen, Jin, Brown (2019) used the Enhanced Bloom Index to study the phenology of almonds in California [23].

Results and Discussion
Once fields have been identified in some part of the territory, which may cover an entire administrative unit, they can be used as a set of training samples for a machine learning (ML) algorithm for automatic field identification. As an experiment, we attempted to visually identify and map the lavender fields in Dobrich province, Bulgaria's leading lavender cultivation region, for 2019, and managed to detect more than 2500 hectares of fields using only Sentinel-2 satellite images. The lavender example shows that at least 80% of the fields can be identified visually using only Sentinel-2 images in a "true color" band combination, and that the share rises to more than 90% of the fields if a set of very high-resolution images is used in addition, at least in a small area with relatively uniform geographic conditions. In future studies we plan to develop and test automatic models, based on different ML algorithms, for detecting and identifying crop fields over wider areas and geographic regions.
The approaches presented above do not allow all existing crops to be detected in an arbitrarily chosen territory in a randomly selected time period. The limiting factors include the small size of fields, the inaccessibility of very high-resolution images and high cloudiness, which makes it impossible to determine the flowering window and capture the dates of sowing and harvest. It is also useful to categorize crops beforehand according to the difficulty of their detection: crops that are simple to detect (popular crop, large area and amount of data, frequent field surveys); crops of medium detection difficulty (a large set of statistical data but a small choice of available satellite images for key dates, or vice versa); and crops that are difficult to detect (small fields, annual crops, lack of field and statistical data, and a small set of available images for the region of interest). In this regard, a very important stage is the collection of high-quality, significant and reliable statistical information on the location and area of fields, including online information and data in different languages, as well as the organization of a convenient format for exchanging such information between local residents and the people conducting the research. It is also necessary to perform a preliminary analysis of whether crop fields can be mapped from the very high-resolution images used by the most popular web mapping platforms.

Directions for further research and open questions
Although the approaches discussed in the paper provide a lot of vital information for accurate identification of the fields of many crops, they are still far from exhaustive for detecting any crop of interest. Below are additional approaches and sources of information that can be used to improve the accuracy of crop field identification, followed by open questions.
1. The use of images taken with greater frequency (for example, daily) makes it possible to determine the onset of a particular phenological phase more precisely and to select images that are closest to the key dates and have minimal cloudiness.
2. The simultaneous use of satellite images of different spectral range, bandwidth and spatial resolution can provide additional opportunities for more accurate identification of the fields on which a particular crop is grown.
3. The use of accurate data on the seasonal development of plants in each specific region of cultivation, depending on the environmental conditions that determine differences in the color, shape, size and structure of fields and plantings, the appearance of the plants, the main varieties used, and the approximate onset dates of each phenological phase.
4. Searching for photos and videos that simplify the identification of fields of a specific crop in a specific area and make it possible to assess the current condition of the fields and the harvest. This is more suitable for perennial crops.
5. Searching for organizations and establishing a convenient format for exchanging relevant information with local farmers and researchers; detailed planning, increasing labor productivity and automating field work if local residents are involved.
6. Searching for detailed statistical data on the dynamics of the area occupied by a particular crop, published annually by government agencies. This is mainly suitable for perennial crops.
7. Searching for information on the timing of the beginning of the harvest at the level of small administrative units (for example, a province in Bulgaria or a krai in Russia), which is publicly available and can be obtained from online newspapers.
8. Applying the same approaches to regions that are similar in geographical and ecological conditions but differ in the availability of freely accessible geographical and field data, taking into account the local specifics of each region; for this it is extremely important to have at least some information about the location of fields in one of the compared regions.
9. Searching for, creating and using libraries of spectral and phenological curves that make it possible to identify fields and plantings of crops of interest in areas with similar geographical and ecological conditions, and using these data together with very high spectral resolution data (hyperspectral imagery).
10. Organizing a field verification stage to assess the overall reliability of the detection and identification of crops and fields.
11. How can the accuracy of crop mapping be improved when most of the available images were captured on cloudy dates? How efficient will radar data be compared to optical data?
12. Is it possible to identify fields and estimate yields of less popular annual crops in near real time (for the current season), or is this only possible for the past few years?
13. What further actions can be taken when scientific and statistical information on a territory and up-to-date high-resolution images are scarce or absent?

Conclusion
The paper describes several methodological approaches to mapping crop fields from satellite images using phenological, geographic, ecological and statistical information in addition to the traditional approaches based on vegetation indices. Using the example of three crops (lavender, almond, winter barley), it was shown that color in the satellite image is an important feature that allows fields of a specific crop to be identified among other types of land cover, in particular in areas where similar and more popular crops are cultivated. Statistical information (the area occupied by the crop fields or the sown area in different years, the growing season, and the start and end dates of the harvest), when used together with the color of the fields, makes it possible to visually distinguish the fields of the mapped crop from other vegetation cover types and from fields of similar crops and, in addition, to validate the accuracy of the field mapping. As an example, we attempted to visually identify and map all lavender fields from satellite imagery in Dobrich province, Bulgaria's leading lavender cultivation region, and found that the accuracy was about 80% using only Sentinel-2 images and increased to more than 90% when a set of very high-resolution images, derived from the basemaps of the most popular web mapping platforms, was included in the workflow. The data obtained on the color changes of the fields of the mapped crop during the growing season provide valuable information for searching for fields of the crop of interest in areas with similar geographic or ecological conditions but different availability of geographic data relevant to mapping that crop. We found that information about field color, statistical, geographic and ecological information, and long time-series data, e.g. curves of vegetation indices (for example, seasonal and/or long-term changes of NDVI values), can significantly improve the accuracy of crop mapping when used together. In addition, we identified a range of other useful directions, approaches and questions that can further improve the process of mapping the fields of a crop of interest.