Mapping the Evolutionary Pattern of Mobile Products: A Phylogenetic Approach

Product evolution is an emerging research field of increasing academic and practical interest. Although there have been several qualitative and quantitative studies on product evolution, a consensus on operational definitions and a generalized methodology for analyzing product evolution has not been reached. Therefore, operational definitions of product gene, product taxon, and ancestor–descendant relationship are presented in this article. Based on these definitions, a unique product phylogenetic tree methodology is introduced that considers the differences between biology and products. The methodology can be applied to any product data because it comprises generalized algorithms. To verify the definitions and methodology, a mobile product phylogenetic tree is constructed using web-crawled data on global mobile products released between 1995 and 2019. The definitions and methodology were verified by observing historical events, such as the emergence of the smartphone and the growth of Chinese firms, on the resulting tree. In addition, three applications of the methodology are introduced: a tool to explain evolutionary phenomena in a product along with previous product evolution theory, a tool to inform firms' strategic decision making, and a tool to predict future product types. In summary, the findings of this study can contribute to the improvement of product evolution research by introducing a data-driven and algorithmic approach for analyzing products that can be utilized in future studies. Furthermore, the proposed methodology aids decision making in product development by providing trajectories of products that are expected to survive.

and consume it by fire even as man consumes it; it supports its combustion by air as man supports it; it has a pulse and circulation as man has" [1]. In Erewhon, machines such as steam engines are treated as living beings, and they are improved by retaining the functional modules that favor survival and removing the others (although the novel does not describe them as evolving, the process resembles evolution). Hence, the possibility of explaining technological change as an evolutionary process was suggested more than a century ago. Based on this possibility, researchers have actively applied the concept of evolution not only to technology but also to the economy, culture, and products [2], [3], [4], [5], [6], [7], [8].
Recently, interest in product evolution research as a field related to technology evolution has increased considerably. This is because a product corresponds to a set of technologies [9], [10], [11], and researchers intend to investigate principles of product evolution, such as product survival in the market [12], the emergence of successful new products [12], [13], [14], variation in product characteristics that causes industrial transitions [15], and coevolution between products and socioeconomic factors [16], [17]. Several firms and researchers have focused on product evolution to establish appropriate product development strategies [12], [18], [19], [20], [21].
Several qualitative case studies have been conducted [7], [14], [19], [22], [23], and these approaches are important because they indicate the possibility of understanding product evolution through the commonality between biological and product evolution. However, the results are limited in that their interpretation can change depending on the type of product or the insight of the researchers.
In addition to the qualitative studies, other studies use quantitative approaches to analyze evolutionary patterns in products. Although these studies provide an empirical basis for product evolution, they use different definitions of components. For instance, some studies define a product as a representative technology among the technologies that constitute the product [10], [24], while others define a product as a set of technical characteristics [16], [25], [26]. The differences in definition cause differences in the data, measurements, and methodologies used to analyze the evolutionary phenomena of products, making it difficult to develop a unique methodology for product evolution.
As an example of the unique methodology problem, previous studies have described the evolutionary patterns of products by applying phylogenetic tree methodology [27], [28], [29]. These studies scientifically describe evolutionary patterns of products through a methodology from biology. However, a limitation of these studies is that they cannot reflect the unique properties of product evolution because they simply applied the tree algorithm from biology to product data. Establishing a unique methodology for a product phylogenetic tree that considers the differences between products and biology is difficult because there are no operational definitions of the components that constitute the tree.
0018-9391 © 2022 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.
To address the methodological problem in product evolution, this study establishes operational definitions for the components of product evolution based on previous literature. Further, a unique algorithm for a product phylogenetic tree based on these definitions is introduced. The product phylogenetic tree introduced in this study can scientifically describe the overall evolutionary process of a product and can be used as a generalized methodology applicable to various product data. To verify the accuracy of the operational definitions and methodology in this study, a mobile product phylogenetic tree is constructed, and historical events in mobile products are observed in the resulting tree.
Establishing the operational definitions and developing the methodology in this study can serve as a cornerstone for the consistent development of product evolution research and suggests the possibility of its development into a scientific field. In addition, the resulting product phylogenetic tree provides objective patterns of product evolution. Based on these patterns, firms can make scientific, evidence-based decisions about their future.
The remainder of this article is organized as follows. Section II reviews the previous literature. Section III presents operational definitions and introduces the generalized methodology as an algorithm for constructing a product phylogenetic tree. Section IV demonstrates the operational definitions and algorithms based on a case study of a mobile product. Section V provides an analysis of the resulting phylogenetic tree to confirm its validity and usability. Finally, Section VI discusses the significance of the findings as well as the scope for future work.

A. Qualitative Studies of Evolutionary Pattern in Product
To confirm the possibility that the development of a product can be understood based on an evolutionary approach, many qualitative studies have been conducted based on the commonality between product development and evolutionary patterns.
Charles Darwin defined evolution as a process of gradual change from common ancestors shared by species, referred to as descent with modification. The principle of descent with modification is applicable to both biological and product evolution. Products also evolve based on previous products [19]. This evolving pattern can be referred to as descent with modification in product evolution [14], [23]. Therefore, much of the previous literature on product evolution has qualitatively explained the development of a product by using descent with modification. For instance, the turbojet originated from the Whittle W.1 [14], clothes washing machines originated from hand- and foot-driven models [19], the modern fork originated from a two-tined fork [30], airplanes originated from the Wright brothers' flyer [23], vehicles originated from Ford's Model T [23], and programming languages such as ALGOL, BASIC, and Python originated from Fortran [23], [31].
Speciation is also a representative evolutionary pattern that is commonly observed in product and biological evolution. In this pattern, one species that previously possessed genetic homogeneity separates into varied species with genetic heterogeneity. Through speciation, organisms become diverse [32]. Products are also diversified through speciation. At first, a new product type is not an entirely new concept but a modification of previous products. However, after the divergence process, new product types possess characteristics completely distinct from those of previous product types [19]. Levinthal [7] defined speciation in technology as a phenomenon wherein existing technologies are expanded to new domains through new market selection criteria and resource abundance. Based on this definition, he proposed a conceptual model for speciation. A few examples of speciation in products are the speciation from radio to TV [19], from Hertz's experiments to wireless telephony [7], and the speciation of Polynesian canoes due to the isolated island environment [17].
The term "extinction" refers to the dying out of a species that is unable to respond appropriately to environmental changes.Similar to biological extinction, products also go extinct from their markets if they fail to meet consumer preferences [19], [33].This extinction is mainly observed when the product design converges into a new dominant design.This is because of the change in consumer preferences to new dominant designs, and products that differ from the dominant design do not meet the preferences [5], [16], [21], [34], [35], [36].A few examples of extinction in products are the stone axe, steam engine, and horse-drawn carriage [23].
There are additional patterns, such as horizontal gene transfer, which is a recombination of components from other categories [14]; exaptation, wherein the product is used for a different purpose than intended [22]; coevolution, which is the interaction between technologies in products [6]; and episodic change, wherein an explosive increase of products occurs in a short period [16]. Table I shows the evolutionary patterns in product development.
Based on the qualitative studies, the possibility of understanding product development through an evolutionary approach can be confirmed.However, qualitative analysis can result in confusion.For instance, Levinthal explained rapid innovation in the wireless communication industry as "speciation," [7] whereas Wagner and Rosen explained it as an "episodic change" [23].Carignani et al. defined horizontal gene transfer as a recombination of components from different categories [14].By contrast, Wagner and Rosen stated that horizontal gene transfer occurred in the same category [23].This difference in opinions could be caused by a lack of empirical evidence to define evolutionary patterns of products as an objective criterion.

B. Definition of Product in Quantitative Studies
To ensure that product evolution studies are objective and scientific, quantitative studies based on product data are required. In quantitative research, two different definitions of a product are commonly used. The first approach is to define the product as a representative technology [5], [10]. For example, a freight locomotive is represented by its tractive effort, which can be measured in pounds [10]; cement is represented by the capacity of the kiln [5]; and minicomputers are represented by CPU speed [5]. Based on this definition, studies related to the cyclical patterns of technological change [5] and parasitism in technology [10] were conducted.
The second definition of the product is based on Saviotti's framework [9], [37], [38], wherein a product is assumed to possess technical characteristics that are associated with service characteristics. In this framework, a tank is defined as a set of width, length, weight, engine power, range, road speed, armor thickness, and armament caliber [16], [25]. Based on Saviotti's definition of the product, several quantitative studies have been conducted. For example, these studies report that the greater the complexity of a product, the longer the life of the product [12], and that the evolutionary trajectory of a product is affected by the social environment [16]. Saviotti's definition is also used in other related research fields on products, such as hedonic price analysis [39], [40], [41], [42]. For instance, Chwelos et al. [39] defined a PDA product as a set of battery life, volume, weight, display size, pixels, colors, total memory, and clock speed; Dewenter et al. [41] defined a mobile product as a set of weight, volume, age, battery duration, ringtones, Bluetooth, MP3, and MMS.
The two definitions may cause differences in methodology for observing evolutionary patterns in products; for instance, in the first definition, evolutionary change in the product is only observed by measuring the change in the representative technology [5].However, in the second definition, dimensionality reduction [16], [25], [37] or social network analysis [18] are used for observing a change in a product as a set of technologies.
The different definitions of a product result in differences in methodologies, which affects the consistency in the analysis and interpretation of evolutionary patterns of products.In other words, a lack of consensus on the definition of a product hinders consistent improvement of the product evolution.

C. Previous Studies of Product Phylogenetic Trees
A phylogenetic tree is a network that represents the flow of evolution based on the similarity of genes in organisms and is commonly used to observe evolutionary phenomena [43].The tree consists of taxa as nodes and ancestor-descendant relationships between taxa as links.There are two types of tree; one is cladogram, which only represents the classification system of existing taxa, and the other is chronogram, which represents the classification system and the evolutionary times as branch lengths [44].
Researchers in biology have used phylogenetic trees to describe the evolutionary process to study the evolutionary problem [44].Similarly, to extract relevant information related to the product evolution, several studies have used the phylogenetic approach in product evolution.
For instance, O'Brien et al. [27] constructed an archaeological artifacts phylogenetic tree for resolving archaeological problems in the southeastern US.The tree in O'Brien's study was constructed as a cladogram that expresses the common ancestors from which archaeological artifacts evolved, and the similarity between the artifacts was measured based on its characteristics.They also defined taxa based on whether it possessed certain characteristics, and they used the cladistic methodology in biology.
Khanafiah and Situngkir [28] constructed a cladogram-type phylogenetic tree using the Nokia mobile phone product data.The product data consisted of the characteristics that describe the design and technical functions of the phones.Based on this data, the distance between products was measured and a Nokia mobile phone phylogenetic tree was constructed using the UPGMA (Unweighted Pair Group Method with Arithmetic mean) algorithm, which is commonly used in biology.
Tëmkin and Eldredge [29] constructed a phylogenetic tree of the Baltic psaltery and the cornet using PAUP software, which measures the distances between instruments based on their characteristics. The authors were able to observe major innovation events in the development of musical instruments through the phylogenetic tree.
Although several studies have employed the product phylogenetic tree to describe the evolutionary process of a product, a majority of these studies only applied the tree construction algorithm in biology to product data.Therefore, these studies failed to reflect the difference between the product and biological evolution.
The difference between biological and product evolution can be established based on whether the ancestors can be identified. In biology, the existence of ancestors cannot be identified before paleontological discoveries. Therefore, phylogenetic trees in biology set a hypothetical ancestor that is only represented as a set of genes or characteristics. In contrast, the ancestors of products can be identified by gathering data.
However, previous studies on product phylogenetic trees do not represent ancestors as real products [28], [29], because these studies only used the biological phylogenetic tree algorithm, wherein hypothetical ancestors were set. Previous literature on product evolution did not develop a unique algorithm for identifying actual ancestors.
The reason why a unique algorithm could not be developed for the product phylogenetic tree is the lack of operational definitions of the components that constitute the tree. Therefore, this study defines the components that constitute the product phylogenetic tree, such as the product gene, product taxon, and ancestor-descendant relationship, and develops a novel algorithm for the product phylogenetic tree using these components.

A. Operational Definitions of Product Phylogenetic Tree Components
An operational definition converts a conceptual object to a scientifically analytical object.Researchers can develop a generalized methodology that helps derive reproducible and objective results based on an operational definition [45], [46].Therefore, before developing a generalized methodology, this section provides operational definitions of the components of a product phylogenetic tree.
1) Product Gene and Genotype: First, the genes of a product should be defined because a phylogenetic tree is constructed based on the similarity of genes in complex organisms. Genes are the units that comprise a genotype, and genotype refers to replicable genetic information that is inheritable to the next generation [14].
To define the genes and genotypes in a product, a definition of the product is required first. As mentioned in Section II.B, many previous scholars have defined a product as a set of technical characteristics [9], [10], [16], [25], [26], [37], [38], [39], [40], [41], [47]. These characteristics are similar to genes in that they are inherited by offspring and gradually improve through descent with modification [19], [20]. Therefore, a product gene ($x$) can be defined as an element of the technical characteristics of a product ($Z$), and a product genotype can be defined as a set of product genes.
For example, consider the mobile product Apple iPad 3 Wi-Fi, as shown in Fig. 1. The technical characteristics of the product are 4.0 A2DP for Bluetooth, scratch-resistant glass oleophobic coating for protection, and iOS 5.1 upgradable to iOS 9.3.5 for OS. These characteristics can be decomposed into more detailed elements, such as 4.0, A2DP, scratch-resistant, glass, oleophobic, coating, and iOS. These detailed elements are defined as product genes in this study, and the representation of the product as a set consisting of product genes is defined as the genotype of the product. Based on this definition, the genotype of Apple iPad 3 Wi-Fi is {4.0, A2DP, scratch-resistant, glass, oleophobic, coating, iOS, …}.
The decomposition of a product into a product genotype helps evaluate the similarity between different products. For instance, suppose the OS characteristics of two different products are "Android 12, One UI 4.1" and "Android 11, One UI Core 3.1". In this form, we cannot directly measure the similarity between the two. However, if we convert them to the product genotypes {Android, 12, One, UI, 4.1} and {Android, 11, One, UI, Core, 3.1}, we can measure the similarity because both have Android, One, and UI as common elements.
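The comparison in the preceding paragraph can be sketched in a few lines of Python. This is only an illustration: the helper name `to_genotype` and the rule of splitting on commas and whitespace are assumptions made here, not the tokenization used in the study.

```python
import re


def to_genotype(characteristic: str) -> set[str]:
    """Split one technical characteristic into product genes (hypothetical helper)."""
    # Assumption: a gene is any token separated by commas or whitespace.
    return {token for token in re.split(r"[,\s]+", characteristic) if token}


os_a = to_genotype("Android 12, One UI 4.1")
os_b = to_genotype("Android 11, One UI Core 3.1")

# Genes shared by both genotypes are what make the two products comparable.
common_genes = os_a & os_b
print(sorted(common_genes))
```

Representing a genotype as a set makes overlap-based measures straightforward to compute, since shared genes are simply the set intersection.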
2) Product Taxon: Organisms can be classified into homogeneous groups that are distinct from other organisms, and they belong to taxa, such as kingdom, genus, and species, which are the units of biological classification. A taxon is used as a node of the phylogenetic tree. The organisms belonging to the same taxon possess a certain level of homogeneity in genes. Similarly, a product ($Z$) can also be classified with a certain level of technical homogeneity. In this study, a product group with a certain level of homogeneous technical characteristics is defined as a product taxon ($C$). The operational definition of a product taxon is expressed in (1). Just as humans cannot be mammals and fish simultaneously, a product also cannot belong to multiple taxa simultaneously, and adding all taxa forms a union set ($U$) as expressed in the following equation:
$$C_i \cap C_j = \varnothing \quad (i \neq j), \qquad \bigcup_{i} C_i = U \tag{1}$$
3) Ancestor and Descendant Between Product Taxa: The links in a phylogenetic tree indicate the ancestor-descendant relationships between the taxa. Therefore, it is necessary to define the relationship between ancestors and descendants among the taxa that exist in adjacent periods to construct a phylogenetic tree. Based on the concept of descent with modification, which is the basic principle of evolution, the descendant products inherit most of the genes from their ancestors. In other words, the ancestor of a descendant product taxon can be defined as the most similar product taxon at the genetic level among the product taxa in the previous period, and this relationship can be represented as (2). For a certain product taxon ($C_i^{T}$) at the period of descendants ($T$), the ancestor ($\Phi_i$) of the product taxon ($C_i^{T}$) can be defined as the product taxon that has the highest similarity, written $\operatorname{sim}$, among any product taxa ($C_j^{T-1}$) at the period of ancestors ($T-1$) [48]
$$\Phi_i = \underset{C_j^{T-1}}{\arg\max}\; \operatorname{sim}\left(C_i^{T}, C_j^{T-1}\right), \qquad T\ (\text{Descendant}) > T-1\ (\text{Ancestor}) \tag{2}$$
4) Structure of Product Phylogenetic Tree: Fig. 2 shows the structure of the product phylogenetic tree constructed in this study. The nodes of the tree are the product taxa, and a link is formed when the taxa have an ancestor and descendant relationship defined in (2). For example, Taxon 2 is the ancestor of Taxon 3 and Taxon 4, and Taxon 3 is the ancestor of Taxa 5-7. Further, Taxa 5-7, which have Taxon 3 as their common ancestor, can be defined as siblings. Although the sibling taxa are homogeneous at first, they can be differentiated over time. The phylogenetic trees in biology and previous studies on product evolution do not represent ancestors, such as Taxon 2, as actual organisms and products. However, the algorithm in this study represents these ancestors as actual products.
The lines connecting taxa in Fig. 2 are called branches. As shown in the image at the lower right of Fig. 2, a branch consists of taxa and possesses consecutive ancestor-descendant relationships. As shown in Fig. 2, if Taxon $t_0$, Taxon $t_1$, Taxon $t_2$, and Taxon $t_3$, which constitute a branch, have a sequential ancestor-descendant relationship, it can be interpreted that Taxon $t_0$ and Taxon $t_3$ have an ancestor-descendant relationship. Consequently, the branch length expresses the number of generations of evolution descended from an ancestor. Representing the number of evolutionary generations as the branch length is an improvement over previous product phylogenetic trees, which cannot represent the number of generations because they are constructed as cladograms with equal branch lengths [27], [28]. In summary, the structure of our product phylogenetic tree improves on previous trees in product evolution by representing actual ancestors and the number of generations.
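The idea that branch length counts generations can be illustrated with a small sketch. It assumes the tree is stored as a mapping from each taxon to its direct ancestor; the function name `generations_between` and the taxon labels are hypothetical and chosen only for this example.

```python
def generations_between(parent_of: dict[str, str], ancestor: str, descendant: str) -> int:
    """Count ancestor-descendant links walked from `descendant` up to `ancestor`."""
    steps = 0
    current = descendant
    while current != ancestor:
        if current not in parent_of:
            # Reached the root without meeting `ancestor`.
            raise ValueError(f"{ancestor!r} is not an ancestor of {descendant!r}")
        current = parent_of[current]
        steps += 1
    return steps


# Hypothetical branch with sequential relationships: t0 -> t1 -> t2 -> t3
parent_of = {"Taxon_t1": "Taxon_t0", "Taxon_t2": "Taxon_t1", "Taxon_t3": "Taxon_t2"}
branch_length = generations_between(parent_of, "Taxon_t0", "Taxon_t3")
print(branch_length)  # 3
```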
Table II summarizes the operational definitions of the components constituting the phylogenetic tree in this section.

B. Algorithm for Product Phylogenetic Tree
This section introduces the algorithm for constructing the product phylogenetic tree using components in Table II.The tree is constructed based on the following steps shown in Fig. 3.The algorithms for these steps are explained and presented in the pseudo-code as a generalized algorithm.
1) Extraction of Genetic Information From Product: A product is a set of technical characteristics, and each characteristic can be decomposed into product genes. In this step, these product genes are extracted from a product, and the product is converted into a product genotype, that is, a set with the extracted genes as elements.
Table III presents the pseudo-code used in this step. All inputs are products and all outputs are product genotypes. First, the product genes that make up one of the technical characteristics of a product are extracted and appended to a product genotype. This process is repeated for all technical characteristics of the product to convert the product into a product genotype. By applying this process to all products, all products are converted to product genotypes.
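The body of Table III is not reproduced here, so the following Python sketch only follows the prose description of the step: loop over the technical characteristics of each product, extract genes, and collect them into a genotype. The function names and the splitting rule are assumptions, and the sample record simply reuses the iPad characteristics quoted earlier.

```python
import re


def extract_genes(characteristic: str) -> list[str]:
    # Assumption: genes are tokens separated by commas, slashes, or whitespace.
    return [token for token in re.split(r"[,/\s]+", characteristic) if token]


def product_to_genotype(product: dict[str, str]) -> set[str]:
    """Convert one product (characteristic name -> text) into a genotype."""
    genotype: set[str] = set()
    for characteristic in product.values():
        genotype.update(extract_genes(characteristic))
    return genotype


def convert_all(products: dict[str, dict[str, str]]) -> dict[str, set[str]]:
    """Apply the conversion to every product, as the step describes."""
    return {name: product_to_genotype(spec) for name, spec in products.items()}


products = {
    "Apple iPad 3 Wi-Fi": {
        "Bluetooth": "4.0, A2DP",
        "Protection": "Scratch-resistant glass, oleophobic coating",
        "OS": "iOS 5.1, upgradable to iOS 9.3.5",
    }
}
genotypes = convert_all(products)
print(sorted(genotypes["Apple iPad 3 Wi-Fi"]))
```

Words such as "to" survive this naive split, which is one reason a real pipeline would add a stop-word filter.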
2) Detection of Taxa in Product Network: In this step, the product taxa are derived using the product genotypes from the previous step and a community detection algorithm in network theory. A community in the network means a group of nodes. The nodes in the same community contain more links than the nodes in other communities [49], [50], [51], [52], [53]; this implies that nodes in the same community have higher similarities. Specifically, a community is a group with a certain level of homogeneity.
TABLE II: OPERATIONAL DEFINITIONS FOR PRODUCT EVOLUTION
TABLE III: PSEUDOCODE FOR EXTRACTION OF GENETIC INFORMATION FROM PRODUCT
In Section III.A.2, we defined a product taxon as a homogeneous group of products. Therefore, the detected communities, which are homogeneous groups of products in each period, can be defined as product taxa in that period. Table IV presents the pseudo-code for this step. All inputs are product genotypes, which are the result of the previous algorithm, and all outputs are product taxa. Here, $T_0$ is the start period of the product data, $T_L$ is the final period, and $T$ is a period between $T_0$ and $T_L$.
According to the algorithm in Table IV, a product network in $T$, $G_T(N, L)$, is constructed first. Here, $N$ is a set of nodes that are genotypes of products introduced in the market at period $T$, and $L$ is a set of links that connect the product genotypes in $N$. A community detection algorithm, such as the Louvain algorithm [54], is applied to the product network $G_T(N, L)$ to derive the communities of the network. The derived communities are defined as product taxa at period $T$, and the product taxa at period $T$ are appended to the set of all product taxa. By repeating this process from $T_0$ to $T_L$, all product taxa for each period are obtained. These product taxa are used as nodes in the product phylogenetic tree.
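The per-period loop described above can be sketched as follows. To keep the example dependency-free, community detection is replaced by a plain connected-components search, which is a much cruder grouping than the Louvain algorithm named in the text; a real implementation would pass a Louvain routine as the `detect` argument. All function names and the toy link rule (`share_a_gene`) are assumptions for illustration.

```python
from collections import defaultdict
from itertools import combinations


def connected_components(nodes, links):
    """Stand-in for community detection: group nodes joined by any path."""
    neighbours = defaultdict(set)
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, groups = set(), []
    for start in nodes:
        if start in seen:
            continue
        group, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            group.add(node)
            stack.extend(neighbours[node] - seen)
        groups.append(group)
    return groups


def share_a_gene(genotypes):
    """Toy link rule: connect two products that share at least one gene."""
    return [(a, b) for a, b in combinations(genotypes, 2) if genotypes[a] & genotypes[b]]


def detect_taxa(genotypes_by_period, build_links=share_a_gene, detect=connected_components):
    """Return {period: [taxon, ...]}, each taxon being a set of product names."""
    taxa = {}
    for period in sorted(genotypes_by_period):   # from T_0 to T_L
        genotypes = genotypes_by_period[period]
        nodes = list(genotypes)                  # N: products released in period T
        links = build_links(genotypes)           # L: links between genotypes
        taxa[period] = detect(nodes, links)      # communities become taxa
    return taxa


data = {
    1995: {"p1": {"GSM"}, "p2": {"GSM", "SMS"}, "p3": {"CDMA"}},
    1996: {"p4": {"GSM"}},
}
taxa = detect_taxa(data)
print(taxa)
```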
3) Matching of Evolutionary Relationships Between Taxa: To construct the product phylogenetic tree, we should match ancestor-descendant relationships between all product taxa from the previous step. These relationships are used as links in the product phylogenetic tree.
Table V presents the pseudo-code for this step. All inputs are product taxa from the previous step, and the outputs are the ancestor-descendant relationships between all product taxa. First, each product taxon in $T_0 + 1$ detects its ancestor based on (2), and this step is repeated for all product taxa from $T_0 + 1$ to $T_L$. Through this process, all product taxa detect their ancestors, and all product taxa are connected based on the ancestor-descendant relationship. By using all product taxa as nodes and the relationships as links, we construct a network with a tree structure, which is defined as a product phylogenetic tree in this study.
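A minimal sketch of this matching step is given below. It assumes each taxon is summarized by a set of genes and uses Jaccard similarity as a placeholder measure; the generalized algorithm leaves the similarity function open, so `similarity` is a parameter. The names `match_ancestors` and `jaccard` and the sample gene sets are illustrative assumptions.

```python
def jaccard(a: set, b: set) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_ancestors(taxa_by_period, similarity=jaccard):
    """Map each taxon (period, index) to its most similar taxon one period earlier."""
    ancestor_of = {}
    periods = sorted(taxa_by_period)
    for previous, current in zip(periods, periods[1:]):
        candidates = taxa_by_period[previous]
        if not candidates:
            continue  # no possible ancestor in the previous period
        for i, descendant in enumerate(taxa_by_period[current]):
            # Rule (2): the ancestor is the most similar taxon in period T - 1.
            best = max(range(len(candidates)),
                       key=lambda j: similarity(descendant, candidates[j]))
            ancestor_of[(current, i)] = (previous, best)
    return ancestor_of


taxa_by_period = {
    2006: [{"GSM", "SMS", "keypad"}, {"GSM", "touchscreen", "OS"}],
    2007: [{"GSM", "touchscreen", "OS", "WiFi"}, {"GSM", "SMS", "keypad", "camera"}],
}
links = match_ancestors(taxa_by_period)
print(links)
```

Because every taxon after the first period receives exactly one ancestor, the resulting links form a tree (or a forest if the first period contains several taxa).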
In this study, the data for the mobile products were collected from a website (www.gsmarena.com) by web crawling. The web crawler was developed using the Python library BeautifulSoup. This website has commonly been used to analyze mobile products in previous studies [35], [36]. Our data comprised 10 213 mobile products with 50 types of technical characteristics that were released in the global market from 1995 to 2019.

B. Extraction of Genetic Information From Mobile Product
Each mobile product is converted to a product genotype according to the algorithm in Section III.B.1. We applied natural language processing steps, such as stop-word removal and a split tokenizer, to the technical characteristics of a product to separate them into words. In addition, we removed low-frequency words to prevent bias. Then, we defined the remaining words as product genes. As a result, we derived 3888 genes from 10 213 mobile products. Table VI lists the types of technical characteristics and examples of product genes from the mobile product data.
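The preprocessing described above might look like the following sketch. It reads the removed words as those that occur rarely across the data, and it uses an invented stop set and an arbitrary frequency threshold (`min_count`); none of these values come from the study.

```python
from collections import Counter

# Illustrative stop set only; the study's actual list is not given.
STOP_WORDS = {"and", "or", "with", "to", "for"}


def tokenize(text: str) -> list[str]:
    """Whitespace split followed by stop-word removal (lowercasing is an assumption)."""
    return [word for word in text.lower().split() if word not in STOP_WORDS]


def build_gene_vocabulary(characteristics: list[str], min_count: int = 2) -> set[str]:
    """Keep only words that appear at least `min_count` times across all texts."""
    counts = Counter(word for text in characteristics for word in tokenize(text))
    return {word for word, n in counts.items() if n >= min_count}


specs = [
    "Li-Ion battery",
    "Li-Po battery with fast charging",
    "Li-Ion removable battery",
]
vocabulary = build_gene_vocabulary(specs)
print(sorted(vocabulary))
```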

C. Detection of Taxa in Mobile Product Network
According to the algorithm in Section III.B.2, we need to construct a mobile product network with product genotypes as nodes for each year. To construct the network, we made a link between two product genotypes when the similarity between them was higher than the average similarity of the year. We used the Jaccard similarity as the metric to measure the similarity between product genotypes. The Jaccard similarity measures the degree of overlap between two sets [55], [56], [57] and can be used to quantify the number of identical genes between two genotypes; it is commonly used to compare genomes [17], [58]. The adjacency matrix ($M_T$) for a mobile product network in year $T$ and the Jaccard similarity are represented by the following equation:
$$m_{A_T,B_T} = \begin{cases} 1, & \text{if } J(A_T, B_T) > \text{Average similarity} \\ 0, & \text{otherwise} \end{cases} \qquad J(A_T, B_T) = \frac{|A_T \cap B_T|}{|A_T \cup B_T|} \tag{3}$$
Here, $A_T$ and $B_T$ are genotypes of products at year $T$. Average similarity is derived by adding all Jaccard similarities between the product genotypes in year $T$ and then dividing the sum by the number of products in year $T$. $m_{A_T,B_T}$ is an element of the adjacency matrix ($M_T$) and represents a link formed between $A_T$ and $B_T$ if the Jaccard similarity between them is greater than the Average similarity; otherwise, there is no link between them.
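As a rough illustration of the thresholding rule, the sketch below computes pairwise Jaccard similarities for one year and links the pairs that exceed the average. It takes the average over all distinct product pairs, which is one possible reading of the averaging step and should be treated as an assumption; the function names and sample genotypes are likewise hypothetical.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Size of the intersection divided by the size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_adjacency(genotypes: dict[str, set]) -> dict[tuple[str, str], int]:
    """Return 1 for product pairs whose similarity exceeds the yearly average, else 0."""
    pairs = list(combinations(sorted(genotypes), 2))
    sims = {(a, b): jaccard(genotypes[a], genotypes[b]) for a, b in pairs}
    # Assumption: the average is taken over all distinct product pairs.
    average = sum(sims.values()) / len(sims) if sims else 0.0
    return {pair: int(sim > average) for pair, sim in sims.items()}


genotypes_in_year = {
    "A": {"GSM", "SMS", "keypad"},
    "B": {"GSM", "SMS", "camera"},
    "C": {"LTE", "OS", "touchscreen"},
}
adjacency = build_adjacency(genotypes_in_year)
print(adjacency)
```

Here only A and B are linked: their similarity is 0.5, while the other two pairs have no genes in common.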
After constructing a mobile product network in year $T$, we derive the product taxa from the network through community detection. We employed the modularity optimization method using the Louvain algorithm [49], [50], [51], [52], [53], [54]. The Louvain algorithm can significantly reduce the calculation time and limit the tendency toward large-scale super communities. Using this method, we detected communities in the product network and defined the communities as the product taxa in year $T$.
This process was repeated for all product networks from 1995 to 2019, and we obtained 111 product taxa over this period. Table VII lists the product taxa in 2019 derived through this process.

D. Matching of Evolutionary Relationships Between Taxa
To detect the ancestors, we should measure the similarity between product taxa. A product taxon is a set of products and, simultaneously, a set of the product genes that constitute those products. Therefore, in this study, a product taxon can be treated as a document because each product gene is a word.
There are two methods for measuring the similarity between documents. One is embedding a document into a vector and measuring the similarity between the embedding vectors [59], [60]. The other is deriving the probability distribution of words in the document and measuring the difference between the probability distributions [61].
To represent a product taxon as an actual product, we used the embedding method, wherein we derived an embedding vector for a product taxon. We find a product to represent a product taxon by measuring the similarity between the embedding vector of the taxon and the products that constitute the taxon. Specifically, we derived an embedding vector for a taxon using the term frequency-inverse document frequency (TF-IDF) method. The TF-IDF method measures the weight of a word in a document by multiplying the frequency of the word by the inverse frequency of documents that contain the word, and it converts a document into a weighted vector.

TABLE VI: TECHNICAL CHARACTERISTICS AND PRODUCT GENES FROM MOBILE PRODUCT DATA
TABLE VII: PRODUCT TAXA IN 2019
By using the TF-IDF method, a product taxon is converted into a weighted vector that has greater weights for the product genes that characterize the taxon. We used the weighted vectors of product taxa to measure the similarity between taxa and to match the ancestor-descendant relationships between all product taxa according to the algorithm in Table V.
In addition, we can represent a product taxon by a representative product using the embedding vector of the taxon. A product in the product taxon can be represented as a vector whose element is 1 if the corresponding product gene exists in the product, and 0 otherwise. We measured the similarity between the embedding vector of the product taxon and the vector of each product in the taxon. We represented a taxon by the product with the highest cosine similarity among the products manufactured by the two firms with the largest number of products released in the taxon. Then, we constructed a network in the form of a tree, with the representative products of the product taxa as nodes and the ancestor-descendant relationships as links.
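The weighting described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the exact variant used in this study: it assumes term frequency normalized by taxon size and an unsmoothed logarithmic inverse document frequency, and it does not reproduce the matching algorithm in Table V. The taxon names, gene labels, and function names are hypothetical.

```python
import math
from collections import Counter

def tfidf_vectors(taxa):
    """taxa: dict taxon name -> list of product genes, one entry per
    occurrence of the gene in a product of that taxon."""
    n_docs = len(taxa)
    doc_freq = Counter()
    for genes in taxa.values():
        doc_freq.update(set(genes))  # count each gene once per taxon
    vectors = {}
    for name, genes in taxa.items():
        counts = Counter(genes)
        total = sum(counts.values())
        vectors[name] = {
            gene: (count / total) * math.log(n_docs / doc_freq[gene])
            for gene, count in counts.items()
        }
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse weight dictionaries."""
    dot = sum(w * b.get(gene, 0.0) for gene, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical taxa from two consecutive years.
taxa = {
    "feature_2006": ["keypad", "keypad", "keypad", "camera", "camera"],
    "smart_2006": ["os", "os", "touchscreen", "camera"],
    "smart_2007": ["os", "os", "os", "touchscreen", "touchscreen", "camera"],
}
vectors = tfidf_vectors(taxa)
sim_smart = cosine(vectors["smart_2007"], vectors["smart_2006"])
sim_feature = cosine(vectors["smart_2007"], vectors["feature_2006"])
```

A gene shared by every taxon (here `camera`) receives zero weight, so the similarity is driven by the genes that distinguish taxa, and the 2007 smartphone taxon is closest to the 2006 smartphone taxon.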
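The selection of a representative product can be sketched as follows. The snippet assumes that the taxon embedding is available as a dictionary of gene weights and that each product is given as a set of genes (its 0/1 vector); the product names, firm names, weights, and function names are hypothetical.

```python
import math
from collections import Counter

def cosine_binary(genes, weights):
    """Cosine between a 0/1 product vector (given as a gene set) and a
    weighted taxon vector (given as a gene -> weight dict)."""
    dot = sum(weights.get(gene, 0.0) for gene in genes)
    norm_product = math.sqrt(len(genes))
    norm_taxon = math.sqrt(sum(w * w for w in weights.values()))
    if not norm_product or not norm_taxon:
        return 0.0
    return dot / (norm_product * norm_taxon)

def representative_product(products, weights, top_firms=2):
    """products: list of (name, firm, gene set) tuples for one taxon.
    Only products of the firms with the most products are candidates."""
    firm_counts = Counter(firm for _, firm, _ in products)
    leading = {firm for firm, _ in firm_counts.most_common(top_firms)}
    candidates = [p for p in products if p[1] in leading]
    return max(candidates, key=lambda p: cosine_binary(p[2], weights))[0]

# Hypothetical taxon embedding and member products.
weights = {"os": 0.6, "touchscreen": 0.3, "keypad": 0.1}
products = [
    ("P1", "FirmA", {"os", "touchscreen", "keypad"}),
    ("P2", "FirmA", {"keypad"}),
    ("P3", "FirmB", {"os"}),
    ("P4", "FirmB", {"os", "keypad"}),
    ("P5", "FirmC", {"os", "touchscreen"}),
]
chosen = representative_product(products, weights)
```

In this example P5 has the highest cosine similarity overall, but FirmC is not among the two firms with the most products, so the representative is chosen from FirmA and FirmB.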

V. ANALYSIS OF MOBILE PRODUCT PHYLOGENETIC TREE
The result of constructing a mobile product phylogenetic tree according to the method described in Section IV is shown in Fig. 4. In this study, "Smartphone" is defined as a mobile product with an OS as a technical characteristic, "Pure Featurephone" is defined as a mobile product without an OS, and "Pseudo-Smartphone" is defined as a product without an OS but with characteristics, such as a keyboard and touchscreen.
The name of a representative product is shown if its taxon has more than three descendants, is a terminal node, or lies on an important branch. The reliability and usability of the resulting tree are verified in this section.

A. Verification of the Tree From Mobile Product History
The reliability of a model can be verified based on whether the model describes historical events well [62]. Using this approach, we verify the reliability of the operational definitions and the algorithm by confirming that the resulting tree describes historical events well, such as the emergence of smartphones and the growth of Chinese firms.
1) Emergence of Smartphone: In the global market, Apple and Blackberry were among the first companies to introduce smartphones, which became commercially successful in the mid-2000s. The global sales of smartphones exceeded the sales of traditional mobile products in 2013 [35].
From the resulting tree, we observed that no descendants of taxa with a pure featurephone as the representative product (purple in Fig. 4) appear after 2011, and no descendants of taxa with a pseudo-smartphone as the representative product (brown in Fig. 4) appear after 2013. In contrast, the taxa with a smartphone as the representative product (green in Fig. 4) continued their lineage by producing descendants. It can be observed that only the descendants of the smartphone taxa survived after 2013.

2) Growth of Chinese Firms: The second historical event is the growth of Chinese manufacturers in the late 2010s [63]. Since 2016, products from Chinese manufacturers have appeared as representative products of product taxa in the mobile product phylogenetic tree (grey in Fig. 4). This indicates that Chinese manufacturers have released many mobile products to the global market since the late 2010s.

B. Robustness Check
The generalized methodology for constructing the product phylogenetic tree presented in Section III comprises the similarity metric, the function for constructing the product network, and the community detection function. Among these, the similarity metric used to construct a product network can make a difference in the resulting tree. Therefore, we performed a robustness check with alternative similarity metrics, namely Euclidean similarity, cosine similarity, and gene co-occurrence [56], [57]. The results of the robustness check show a consistency level of approximately 80% with the original tree. From this result, the robustness of our model with respect to the similarity metric is confirmed. The details of the robustness check are given in Appendix A.
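To make the comparison concrete, the following sketch computes Jaccard similarity and three alternatives between two products represented as sets of product genes. The exact definitions used in this study follow [56], [57] and are not reproduced here; in particular, treating co-occurrence as the count of shared genes and converting Euclidean distance to a similarity as 1/(1 + d) are assumptions made for illustration, and the gene labels are hypothetical.

```python
import math

def jaccard(a, b):
    """Shared genes divided by all genes present in either product."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cosine(a, b):
    """Cosine similarity of the two 0/1 gene vectors."""
    return len(a & b) / math.sqrt(len(a) * len(b)) if a and b else 0.0

def co_occurrence(a, b):
    """Assumed definition: raw count of genes shared by both products."""
    return len(a & b)

def euclidean_similarity(a, b):
    """Assumed definition: 1 / (1 + Euclidean distance of the 0/1 vectors).
    The squared distance equals the number of genes in exactly one product."""
    return 1.0 / (1.0 + math.sqrt(len(a ^ b)))

# Two hypothetical products described by their product genes.
product_x = {"os", "touchscreen", "camera"}
product_y = {"os", "camera", "keypad"}

scores = {
    "jaccard": jaccard(product_x, product_y),
    "cosine": cosine(product_x, product_y),
    "co_occurrence": co_occurrence(product_x, product_y),
    "euclidean": euclidean_similarity(product_x, product_y),
}
```

Because the metrics scale the same overlap differently, a fixed link threshold produces different product networks, which is why the choice of metric can change the detected taxa.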

C. Application of Product Phylogenetic Tree
1) Speciation Theory in Mobile Product Phylogenetic Tree: The methodology in this study can be used as a tool to describe the evolutionary patterns of products. The tool can provide empirical evidence to explain the evolutionary phenomena of a product alongside theories of product evolution that have so far been presented through qualitative approaches only. As an example, we explain the speciation phenomenon of the smartphone by using Levinthal's technological speciation theory, presented in 1998, and the mobile product phylogenetic tree in this study.
Fig. 4 shows four branches that split from the Samsung S100 in 2002. One of the branches is connected to the Qtek 2020i (green), which adopted Microsoft Windows Mobile as the OS. The smartphone taxa then evolved along this branch. The mobile phylogenetic tree in this study suggests that the phylogeny of the smartphone branch began in 2003. However, there is a general perception that smartphones first emerged in either 2006 or 2007, with the introduction of the BlackBerry and iPhone [35]. To reconcile these two interpretations, namely the phenomena observable on the phylogenetic tree and the general perception, it is necessary to understand the process of speciation in biological evolution.
The process of biological speciation is as follows. 1) There is one homogeneous species, and 2) the species is separated into two or more species whose genetic flow is hindered by a barrier. 3) As the genetic flow decreases, the species become genetically distant, and 4) when genetic homogeneity has disappeared sufficiently to stop procreation, the original species is divided into different species even when the barrier disappears [32]. From the mobile product phylogenetic tree, it can be confirmed that the product evolution process follows the same steps as the biological speciation process.
1) There was a homogeneous taxon represented by the Samsung S100 in 2002, and 2) it was divided into four taxa in 2003. 3) These taxa became technically distant depending on whether they possessed the technical characteristics of an OS, and 4) eventually they were divided into different product types, such as pure featurephones, pseudo-smartphones, and smartphones. To explain the product speciation of smartphones based on this speciation phenomenon in the mobile product phylogenetic tree, the branching point in 2003 was identified as the beginning of the smartphone product. This can be referred to as the origin of the smartphone species.
It is difficult for a new species to emerge, survive, and continue for generations. Among the origins of the various species that appear through speciation, it is not clear in advance which species will survive and leave descendants. Levinthal [7] stated that for technologies and products to survive and complete speciation successfully, a new domain to which they can be applied is required, and that this domain must have a new market selection criterion and resource abundance. New technologies and products must be utilized in the domain to generate new technology and product species. This process is called lineage development.
The Apple iPhone, which was released in 2007, can be interpreted as providing a domain for smartphone lineage development. The App Store introduced by Apple served as an opportunity to actively utilize the technical characteristics of OSs, and the extensive content of the App Store resulted in a new selection criterion for mobile products. The iPhone provided an environment that made the smartphone species a genuinely successful new product type.
In conclusion, through the mobile product phylogenetic tree and Levinthal's speciation theory, it is possible to identify the origin of the smartphone species in 2003 and the role of Apple products, which helped establish an appropriate environment to develop the lineage of smartphones.
2) Evolutionary Trajectories of Major Firms in Period of Emerging Smartphone: It is important for firms to identify where their products exist in the technological space [56], [64]. Our tree can describe the evolutionary position of products based on their technical information. Based on the evolutionary positions in the tree, we can describe firms' product evolutionary trajectories and explain how the choice of trajectory affected their fate. For the analysis, we considered major firms, namely Apple, Samsung, Nokia, and LG. They are known to have experienced markedly different fates before and after the emergence of smartphones.
If a firm's products account for more than 10% of the total number of products in a taxon, we show a representative product of the firm for that taxon. The rule for finding the representative product of the firm is the same as that in Section IV-D, the only exception being that only products released by the firm are used. Unlike for the other firms, we set a representative product whenever any Apple product is in the taxon because Apple released only a few mobile product models.
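The labeling rule above can be sketched as a small filter. The function name, the product list, and the product counts below are hypothetical; only the 10% threshold and the exception for Apple come from the text.

```python
from collections import Counter

def firms_to_label(taxon_products, threshold=0.10, always_include=("Apple",)):
    """taxon_products: list of (product name, firm) pairs for one taxon.
    Returns the firms for which a representative product is shown."""
    counts = Counter(firm for _, firm in taxon_products)
    total = len(taxon_products)
    return sorted(
        firm for firm, count in counts.items()
        if count / total > threshold or firm in always_include
    )

# Hypothetical taxon with 20 products.
taxon_products = (
    [(f"S{i}", "Samsung") for i in range(12)]
    + [(f"N{i}", "Nokia") for i in range(6)]
    + [("L0", "LG"), ("A0", "Apple")]
)
labeled = firms_to_label(taxon_products)
```

Here LG and Apple each hold 5% of the taxon, but only Apple is labeled because of the exception.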
In Fig. 5(a), it can be seen that the Apple iPhone appeared in the smartphone taxa branch in 2007. Many Apple products also exist in the branches descending from the smartphone taxa. The tree shows that Apple's trajectory has consistently been the primary trend setter in modern mobile product evolution. In other words, Apple's success can be attributed to its choice of an appropriate product evolutionary trajectory. Apple and LG clearly had different approaches for choosing their trajectories in the tree. Apple chose smartphone taxa, but LG did not. Unlike these two firms, Samsung and Nokia released their products in both smartphone and pseudo-smartphone taxa. Therefore, we investigate the trajectories of Samsung and Nokia specifically to understand why they achieved different results in the smartphone market.
As shown in Fig. 6(a), the smartphone taxa are the smallest in 2007, but they continue to grow. After 2008, the smartphone taxa have the highest probability of survival because they contain the most products. As is evident from Fig. 6(b), Samsung continuously increased the number of its products in the smartphone taxa and changed its major trajectory to the smartphone taxa in 2011. In contrast, Nokia has followed the pseudo-smartphone taxa as its major trajectory since 2011. These differences in decision determined the fates of Samsung and Nokia.
3) Extant and Future Mobile Product Categories: The resulting tree represents the mobile product types in 2019, which is the final year from which data were obtained. Mobile products are classified into four taxa (blue taxa in Fig. 4).
These four taxa can be categorized as a smartwatch taxon and the others. The taxon represented by the Apple Watch Edition consisted of only smartwatch products, such as the Apple Watch, Samsung Galaxy Watch, and Xiaomi Mi Watch. This means that the smartwatch formed a distinct category in 2019. Although the taxon consisted of only nine products in 2019, the smartwatch category is expected to evolve gradually from this taxon.
The three other taxa in 2019 consisted of smartphones and tablet PCs. This means that, as of 2019, smartphones and tablet PCs did not differ in technology as distinctly as the smartwatch did. Therefore, they are expected to evolve within a single product category with a strong correlation.

A. Significance of the Study
This study presented operational definitions of the components that contribute to the evolution of a product, including the product gene, the product taxon, and the ancestor-descendant relationship. Based on these definitions, a generalized methodology for describing the evolutionary process of products, called the product phylogenetic tree, was introduced.
To verify the accuracy of the operational definitions and methodology, a mobile product phylogenetic tree was constructed using mobile product data from a website (www.gsmarena.com). The resulting tree appropriately described historical events, such as the emergence of smartphones in the mid-2000s and the growth of Chinese firms in the late 2010s. Through this experimental verification, we confirmed the reliability of the proposed definitions and methodology. In addition, we evaluated the robustness of the model with respect to the similarity metric.
Based on the verified mobile product phylogenetic tree, we presented three applications. First, we used the resulting tree and Levinthal's speciation theory to explain the speciation of smartphones. Using the tree, we identified that the emergence of smartphones followed stylized speciation patterns. We qualitatively explained that the App Store provided a new market selection criterion and resource abundance, which are essential for successful speciation according to Levinthal's theory. Based on these factors, we identified the suitability of this methodology as a generalized tool for explaining various product evolution phenomena along with previous product evolution theories.
Second, the resulting tree revealed information related to the firms' evolutionary trajectories. Based on this information, firms can make decisions to alter their fate. Our tree identifies the steady advances in the smartphone industry since 2003. Apple chose these successful smartphone taxa as its trajectory, whereas LG did not choose smartphone taxa and lagged behind in the market. Samsung did not choose smartphone taxa at first, but it changed its major trajectory to smartphone taxa in 2011, which renewed its standing in the market. Nokia chose smartphone taxa at first but changed its major trajectory to pseudo-smartphone taxa in 2011, which affected its competitiveness. Based on these factors, we can identify the feasibility of our methodology as an effective tool for decision making in a firm.
Third, we could predict future product types based on the present product types on the tree. From the resulting tree, we observed that smartwatch products had separated into a distinct product type by 2019. Therefore, the smartwatch product type is expected to evolve along its own pathway in the future. However, tablet PCs and smartphones coexist in the same taxa. This suggests that no technical criteria for distinguishing tablet PCs from smartphones existed as of 2019.
The theoretical and practical implications of the findings are described herein. From a theoretical perspective, our study suggests the possibility that product evolution can grow into a scientific field. Specifically, this study presented operational definitions for converting a product into an empirical object based on the product evolution theories of previous studies. Based on these definitions, a scientific and unique algorithm for describing the evolutionary process of a product was developed. Our data-driven and algorithmic approach to analyzing product evolution phenomena can guide future research on product evolution.
From a practical perspective, our product phylogenetic tree provides information about products that are more likely to survive in the market. As shown in Fig. 6, our tree identifies the branch that consists of smartphone taxa, which has been the main branch since 2008. Apple and Samsung are the major firms that launched the most products on the main branch and have survived. Other firms, such as LG and Nokia, which are currently struggling in the mobile market, did not choose the main branch. Thus, our tree offers a new methodology for decision making in firms because it provides information on products that are likely to survive.

B. Limitation and Future Research
Although the operational definitions and methodology for the product phylogenetic tree are based on data and algorithms, a limitation of our study is that we used a qualitative approach to validate the resulting tree, analyze firms' strategies, and predict future product types. Therefore, further research is required to develop an objective and quantitative analysis method for product phylogenetic trees. To this end, future studies can draw on research on technological trajectories based on patent data [56], [65], [66], [67], [68], [69]. Technological trajectory research already offers quantitative methodologies and indices for analyzing technological evolution, such as main path analysis [65], [66], [67], genetic analysis [69], topic extraction [61], the entropy index [70], and the expandability and coherence index [56]. Quantitative methodologies and indices for product phylogenetic trees can be developed based on these previous studies, and as a result, the trees can be actively used as a generalized methodology for product evolution.
In addition, the product phylogenetic tree only considers the technological data of a product. However, products are subject to selection by the market. To analyze product evolution accurately, demand data should be evaluated along with the tree. In Section V-C1, we explained the speciation of mobile products considering demand aspects, such as the App Store, but the explanation was brief and qualitative. Therefore, if demand data and the product phylogenetic tree are combined, future studies are expected to give more accurate explanations and predictions of the evolutionary patterns of products.

APPENDIX
The generalized methodology for constructing the product phylogenetic tree, described in Section III, mainly consists of the similarity metric, the function for constructing the product network, and the community detection function. Among these, the similarity metric used to construct a product network can make a big difference in the resulting tree. Different similarity metrics result in different product networks. This causes differences in the product taxa, which are the nodes of the product phylogenetic tree. Therefore, we performed a robustness check with alternative similarity metrics.
Under our algorithm, the taxa derived from Jaccard similarity and the taxa derived from an alternative similarity are different. They must be matched for the purpose of comparison. We match a taxon from Jaccard similarity with a taxon from the alternative similarity if the latter contains the representative product of the former.
For the robustness check, we develop and measure three indices: the number of taxa, the taxa match rate, and the ancestor-descendant match rate.
The number of taxa index is the ratio of the total number of taxa obtained using an alternative similarity metric to the total number of taxa obtained using Jaccard similarity. This index measures the impact of the similarity metric on the number of product taxa.
The taxa match rate measures the extent to which the composition of a product taxon is unchanged. It is derived by counting the number of overlapping products between a product taxon obtained using Jaccard similarity and the matched product taxon obtained using the alternative similarity, and then dividing this number by the number of products in the taxon obtained using Jaccard similarity.
The ancestor-descendant match rate measures the extent to which the ancestor-descendant relationship is maintained. This index is one if an ancestor-descendant relationship obtained using Jaccard similarity is maintained as the relationship between the matched taxa obtained using the alternative similarity; otherwise, it is zero.
The closer the three indices for an alternative metric are to one, the closer the result is to that for Jaccard similarity. We use three alternative similarities: Euclidean similarity, cosine similarity, and gene co-occurrence. The average values of the taxa match rate and ancestor-descendant match rate are used for checking the overall level of change.
As shown in Appendix A, the cosine similarity and co-occurrence cases exhibit a consistency rate greater than 80% for all indices. However, in the Euclidean case, the number of taxa is higher than that in the Jaccard case, and the average ancestor-descendant match rate is slightly lower than in the cosine and co-occurrence cases. This is because Euclidean similarity yields a high average similarity, resulting in a larger number of smaller taxa than the other similarity metrics. These small taxa are the root cause of the discrepancy. Although the Euclidean case yielded a slightly lower consistency level, the level of consistency was approximately 80% for all similarity metrics. Based on these results, we consider our product phylogenetic tree model robust to the choice of similarity metric.
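The matching rule and the three indices can be sketched in Python as follows. This is an illustration under assumptions: taxa are given as sets of product names, the tree is given as a child-to-ancestor mapping, and a Jaccard taxon whose representative product is found in no alternative taxon counts as unmatched. All taxon names, product names, and function names are hypothetical.

```python
def match_taxa(jaccard_taxa, alt_taxa, representatives):
    """Match each Jaccard taxon to the alternative taxon that contains
    its representative product (None if no such taxon exists)."""
    matches = {}
    for name in jaccard_taxa:
        rep = representatives[name]
        matches[name] = next(
            (alt for alt, members in alt_taxa.items() if rep in members), None
        )
    return matches

def number_of_taxa_index(jaccard_taxa, alt_taxa):
    """Ratio of the number of alternative taxa to the number of Jaccard taxa."""
    return len(alt_taxa) / len(jaccard_taxa)

def average_taxa_match_rate(jaccard_taxa, alt_taxa, matches):
    """Mean share of each Jaccard taxon's products kept in its matched taxon."""
    rates = []
    for name, members in jaccard_taxa.items():
        alt = matches[name]
        overlap = len(members & alt_taxa[alt]) if alt is not None else 0
        rates.append(overlap / len(members))
    return sum(rates) / len(rates)

def average_ancestor_match_rate(jaccard_parent, alt_parent, matches):
    """Mean of the 0/1 index over all Jaccard ancestor-descendant links."""
    scores = []
    for child, parent in jaccard_parent.items():
        m_child, m_parent = matches[child], matches[parent]
        kept = (m_child is not None and m_parent is not None
                and alt_parent.get(m_child) == m_parent)
        scores.append(1.0 if kept else 0.0)
    return sum(scores) / len(scores)

# Hypothetical taxa and trees under the two metrics.
jaccard_taxa = {"J1": {"a", "b", "c"}, "J2": {"d", "e"}, "J3": {"f", "g", "h", "i"}}
representatives = {"J1": "a", "J2": "d", "J3": "f"}
alt_taxa = {"A1": {"a", "b"}, "A2": {"c", "d", "e"}, "A3": {"f", "g", "h"}, "A4": {"i"}}
jaccard_parent = {"J2": "J1", "J3": "J1"}              # child -> ancestor
alt_parent = {"A2": "A1", "A3": "A2", "A4": "A3"}      # child -> ancestor

matches = match_taxa(jaccard_taxa, alt_taxa, representatives)
n_index = number_of_taxa_index(jaccard_taxa, alt_taxa)
taxa_rate = average_taxa_match_rate(jaccard_taxa, alt_taxa, matches)
ancestor_rate = average_ancestor_match_rate(jaccard_parent, alt_parent, matches)
```

In this toy case the alternative metric splits one taxon in two, so the number of taxa index exceeds one, and one of the two ancestor-descendant links is not preserved.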

Fig. 5(b) depicts the evolutionary position of LG products on the tree. The most interesting pattern is that LG mobile products have not appeared in the smartphone taxa since 2003. The representative products of LG exist only in the pure featurephone and pseudo-smartphone taxa. It can be clearly observed that LG chose the wrong trajectory. Even during the development of the smartphone taxa, which began in earnest after the advent of the

Fig. 5(d) shows Nokia's products. Nokia released its mobile products in both the pseudo-smartphone and smartphone taxa. However, after 2010, an extinction phenomenon is observed, wherein Nokia products can no longer be identified as representative products in the smartphone branch. Fig. 6(a) represents the number of products in each taxon from 2007 to 2012. Fig. 6(b) shows Samsung's products only, and Fig. 6(c) shows Nokia's products only.
Appendix A. Robustness check for similarity metric.

TABLE IV PSEUDOCODE FOR DETECTION OF TAXA IN PRODUCT NETWORK