DeepVisInterests: Deep data analysis for topics of interest prediction

Deep data analysis for latent information prediction has been an important research area. Many existing solutions rely on textual data and have obtained accurate results for predicting users' interests and other latent attributes. However, little attention has been paid to visual data, which has become increasingly popular in recent times. In this paper, we address the problem of discovering the attributed interests and of analyzing the performance of automatic prediction, using a comparison with the self-assessed topics of interest (topics of interest provided by the user in a proposed questionnaire), based on data analysis techniques applied to the users' visual data. We analyze the content of each user's images and aggregate the image-level interest distributions in order to obtain the user-level interest distribution. To do this, we employ pretrained ImageNet convolutional neural network architectures for the feature extraction step and to construct the ontology, used as the users' interest model, in order to learn a semantic representation of the popular topics of interest defined by social networks (e.g., Facebook). Our experimental studies show that this analysis, on the most relevant features, enhances the performance of the prediction framework. In order to improve our framework's robustness and generalization to unknown users' profiles, we propose a novel database evaluation. Our proposed framework provides promising results that are competitive with state-of-the-art techniques, with an accuracy of 0.80.


Introduction
The last decades have witnessed the boom of deep data analysis techniques, driven by the huge amount of textual and visual data provided by many social networks (e.g., Facebook). Data analysis systems provide a useful and efficient way to power prediction-driven systems [3]. The great advancements in these systems using deep learning, and its improved accuracy compared to traditional machine learning methods, have led to renewed efforts in social network analysis [6]. Hence, managing and understanding the data provided in social networks (e.g., Facebook) are still important research challenges [20] that are useful for diverse applications. Accordingly, we focus our work on data analysis techniques that combine convolutional neural network architectures with a knowledge-based semantic representation, in order to model the user's profile with an ontology and to predict the user's interest distribution. In particular, we focus on pretrained ImageNet Convolutional Neural Network (CNN) architectures [11], which have become a highly recommended feature descriptor in many computer vision areas [4]. Our proposed framework, DeepVisInterests, performs the users' interest prediction task based on a deep neural approach for the ontology construction and the list of topics of interest illustrated in table 1, which increases the classification accuracy of the users' interest prediction system. The major contributions of this paper consist of:
• Developing a novel framework named DeepVisInterests that performs the users' interest prediction task based on pretrained ImageNet convolutional neural networks.
• Designing a new ontology using a set of deep visual features in order to learn the semantic representation of the popular topics of interest.
• Evaluating the system's generalization ability by conducting our novel database testing.
This paper is structured as follows. First, we review the relevant works in section 2. Then, we present our proposed database in section 3. Next, we give a comprehensive explanation of our proposed approach in section 4, with details pertaining to each phase. Section 5 includes the conducted experiments with the obtained results, and section 6 discusses the results achieved on our constructed database. Finally, we conclude the paper with some remarks in section 7.

Relevant Works
The literature provides various methods and purposes for deep data analysis. Due to the rapid advancement of deep learning, data analysis has become more challenging for discovering various users' latent attributes, particularly the users' interests. In fact, the users' interest discovery process requires a modeling phase to enhance the prediction phase.

Users' interest modeling
In [15], the authors suggested a Latent Topic of the users' interest (LUI) model to manage the topic distribution of tweets, which possesses non-Gaussian characteristics. To evaluate their model, the authors employed two microblogs, Weibo and Twitter, to construct two datasets containing 10 million and 100 million tweets respectively, and ultimately obtained correlation coefficients between topics of 64.0% and 63.0% for the two datasets. The work proposed in [5] presented a users' interest model based on Latent Dirichlet Allocation (LDA) to model topics from forums, so as to distinguish between serious users' interest topics and unserious interest topics. To validate their model, the authors employed forum threads from Tianya and obtained accuracies of 80.5% and 93.3% for serious and unserious users respectively. Also, Yang et al. [22] proposed a framework that integrates users' behavior on the opinion and preference aspects. The component has the capability to infer numerical ratings on multiple aspects when such ratings are missing or not explicitly presented. To construct the users' model, they first used LDA to cluster aspect terms into latent aspects; second, a tensor factorization approach is applied to automatically extract the weights of the various aspects while approximating an overall numerical rating; and finally, a simple algorithm computes an overall rating of an item based on both aspect ratings and weights. To validate their model, the authors used two real datasets: an Internet Movie Database (IMDB) dataset containing 193,266 reviews written by 83,585 users of the Internet Movie Database website, and a Hotel reviews dataset containing 81,085 reviews from 879 users. In [17], the authors exploit the users' social data to develop an aspect-based sentiment analysis framework. This framework is based on the neighborhood CF algorithm, KL divergence and a multidimensional Euclidean distance to model the users' social data.
To evaluate their proposed model, Musto et al. [17] used three databases: Yelp with 11,537 reviews from 45,981 users, TripAdvisor with 3,954 reviews from 536,932 users and Amazon with 50,210 reviews from 826,773 users.

Users' interest prediction
Some recent research efforts have been made to exploit data provided in social networks using data analysis and deep learning techniques for users' interest prediction. In [24], the authors suggested a novel approach for image-level and group-level label propagation for users' interest prediction. They employed the AlexNet architecture for deep visual feature extraction and image-level similarity to propagate label information between images, in order to disseminate the topic-of-interest level over all the user's images. To validate their approach, the authors used a novel database collected from Pinterest containing 6,000 images from 300 users' accounts and obtained an accuracy of 43.0%. In [21], the authors proposed an assessment method for social-user-based suggestions built on categorical interest classification. This method relies on convolutional neural network architectures and on a hierarchical topic-of-interest categorization. To validate their approach, the authors used a database containing 20,500 images collected from Pinterest and obtained an overall precision of 39.9%. In [2], the authors presented a classification of social images using an unsupervised learning algorithm for users' interest prediction. To verify the validity of the proposed method, the authors used a novel database containing 800 social images collected from Pinterest and obtained an accuracy of 68.0%. Table 2 illustrates the topics of interest used by the aforementioned related works.

Discussion
Building a users' interest prediction model needs a deep understanding of the users' social data. From the literature it is obvious that this prediction has been obtained by analysing explicit social data, although prior works have examined the performance of ontologies and deep learning techniques for object detection when mining users' latent information. In fact, obtaining a more accurate prediction needs a deep understanding of the users' social visual data. Beginning with the users' interest modeling phase, the LDA approaches present users' data as a multinomial distribution over words and documents, while TF-IDF considers the ratio of the frequency of terms in each document over the total number of terms as a topic probability. These methods achieve good accuracy in various research areas, but one limitation shared by them is the representation of data using the Bag-of-Words method. Concerning the users' interest prediction, topic modeling methods exploit the social data to predict latent information about social users, such as the topics of interest. Plentiful research has been performed in the domain of users' latent information prediction. Hence, the main objective is to classify social data, textual or visual, into some classes according to the prediction area (sentiments, opinions, preferences, etc.) in order to understand the social users' behavior. Several attempts have been made to categorize social data, and the literature attains good performance, but it suffers from some limits, like the use of Pinterest as the social data source instead of Facebook. In fact, very little work has been done that uses the social visual data shared on Facebook for developing a supervised deep-learning-based users' interest prediction. Indeed, Facebook is the most important social network for several user types, and it has become an important way to share a huge amount of data daily.
Also, some recent efforts have been made to exploit social data by adopting deep learning approaches for this prediction, yet the mentioned works used traditional topic modeling methods like LDA and feature extraction methods like the AlexNet architecture.

VisualDatabase: Pictures and Interest
Data provided in social networks are considered sensitive data, as they reflect the private life of social media users. Therefore, creating a social database requires the approval of social media users to gather data, so that their private life is respected. To this end, an application has been created [13] allowing users to voluntarily subscribe and, therefore, give us permission to gather the data they publish on their social network (e.g., Facebook) accounts/pages. Moreover, only abstract images with natural scenes and neutral themes have been selected, to avoid personal images/pictures of the user's family or friends. Our database, named VisualDatabase, is a set of multiple social images and was created in March 2018. It contains social data from 240 accounts, simply called social users. For each social user, there are 100 images selected at random from the "liked" and "shared" ones. The database images have a 320×320 resolution, with multiple ethnicities and locations (Africans) and from different age intervals (between 15 and 60 years old). Figure 1 shows some image samples. Furthermore, the database presents the self-assessed users' interests based on a big interest questionnaire (BI) voluntarily filled in by each user. The result of the BI is a vector where each component indicates the disposition of a user with respect to the core topics presented in table 1. The self-assessed traits are taken to be the validated user's interest scores.

Proposed Approach
The problem of users' interest prediction may be considered an image classification problem. However, in contrast to traditional image classification, where the objective is to maximize classification performance at the individual image level, we rely more on learning the overall user-level image distribution. Our proposed framework, named DeepVisInterests, is illustrated in figure 2 and is based essentially on a users' interest modeling phase followed by a prediction phase.

Phase 1: Users' interest modeling
Ontologies have been conceived to allow for a common definition of concepts, entities, relationships, situations and events, and consequently for common understanding and for promoting information exchange. In fact, we observe that ontologies have been used with deep learning techniques to address uncertainty in image classification based on object detection methods [8]. This observation led us to exploit an ontology as the users' interest model, to classify each user's images under the 24 topics already presented in table 1 and to predict the user's topic of interest. An overview of our constructed ontology is illustrated in figure 3. In our proposed ontology, the sub-concepts are the 24 topics. We have used 24 benchmark databases to exhibit the 24 topics (each database contains approximately 200 images). Taking an unannotated image as input, we employ a pretrained ImageNet CNN architecture for object detection to detect the top 5 ranked objects, among the 1000 objects of ImageNet, that will be incorporated into the ontology as the end-concepts, which are linked to the sub-concepts by the object property "is-a". The main architecture comes from the CNN presented by Krizhevsky et al. [12], which obtained state-of-the-art performance on the challenging ImageNet classification task. For a more detailed description of this architecture, we direct the reader to the original paper. In fact, the ontology implementation features represent the number of concepts, data properties, object properties, individuals and axioms. To construct our ontology, we used the tool named Protégé, version 4.3, which also provides Resource Description Framework (RDF) schemas and XML scripts for using the ontology on the web. Protégé is an ontology development environment with a huge number of active users. This environment has been extended with support for OWL (Web Ontology Language) and has become one of the leading OWL tools [1].
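The "is-a" linking step described above can be sketched as a lookup from detected end-concepts to topic sub-concepts. The mapping below is a toy illustration; the object and topic names are assumed examples, not the actual contents of the ontology:

```python
# Illustrative sketch: replace each CNN-detected object (end-concept)
# by its topic sub-concept through an "is-a" mapping. The entries are
# hypothetical; the real ontology covers 24 topics and ImageNet's
# 1000 object labels.
ONTOLOGY_ISA = {
    "espresso": "Drink",
    "cup": "Drink",
    "dough": "Food",
    "hamburger": "Food",
    "snowboard": "Sport",
}

def super_concepts(top5_objects):
    """Map the top-5 detected objects to their topic super-concepts."""
    return [ONTOLOGY_ISA.get(obj, "Unknown") for obj in top5_objects]

print(super_concepts(["espresso", "cup", "dough"]))  # ['Drink', 'Drink', 'Food']
```

In the actual framework this lookup is performed by reasoning over the OWL ontology rather than a flat dictionary.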
Our proposed ontology applies all the implementation features while the object properties are used to define the relationship between individuals.

Users' interest ontology vocabulary:
We define the basic metrics for the size of the users' interest ontology along various aspects. The size of our ontology is defined as follows. Let O be the users' interest ontology, with C its set of concepts, T its set of data types, I its set of instances, R its set of relations and A its set of attributes; then |C(O)| = 33, |T(O)| = 5, |I(O)| = 443, |R(O)| = 32 and |A(O)| = 1555.
• I is a collection of finite sets indexed by C, I = {I_c | c ∈ C}, where each I_c is the set of instances of concept c.
• A is a collection of sets of attributes, with A = {A_c | c ∈ C}. Each a ∈ A_c is an attribute of concept c.
• The value of each attribute a ∈ A_c for an instance i ∈ I_c of concept c is denoted a(i); a(i) is either a data value of type T or an instance of some concept c′.
• R is a set of binary relations on the set of concepts, R = {r_1, r_2, ..., r_n}, where each r ∈ R satisfies r ⊆ C × C′.
Users' interest ontology structure: Structural metrics are the most widely examined metrics for describing an ontology; in particular, cohesion metrics measure the degree of relatedness between concepts. Among the cohesion metrics, we find the relation-based structural complexity. In fact, for each r ∈ R we have the following structural metrics:
• the number of root nodes, NoR(r) = ||Root(r)||,
• the number of leaf nodes, NoL(r) = ||Leaf(r)||,
• the maximum length of a simple path,
• the number of isolated nodes,
• the total number of nodes reachable from the roots,
• the average number of nodes reachable from the roots.
For the users' interest ontology, these "is-a" relation-based structure metrics are computed over the concept graph.
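The root, leaf and isolated node counts used in these cohesion metrics can be computed directly from the "is-a" edge list; the snippet below is a minimal sketch on a toy concept graph, not the full ontology:

```python
# Sketch of the "is-a" relation-based structural metrics: roots have
# no parent among the edges, leaves have no child, isolated nodes
# take part in no edge. Concept names are illustrative.
def structural_metrics(concepts, isa_edges):
    # isa_edges: set of (child, parent) pairs
    children = {c for c, _ in isa_edges}   # nodes that have a parent
    parents = {p for _, p in isa_edges}    # nodes that have a child
    roots = {c for c in concepts if c in parents and c not in children}
    leaves = {c for c in concepts if c in children and c not in parents}
    isolated = {c for c in concepts if c not in children and c not in parents}
    return roots, leaves, isolated

concepts = {"Interest", "Food", "Drink", "espresso", "dough", "Orphan"}
edges = {("Food", "Interest"), ("Drink", "Interest"),
         ("espresso", "Drink"), ("dough", "Food")}
roots, leaves, isolated = structural_metrics(concepts, edges)
print(roots, leaves, isolated)
```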

Users' interest ontology context:
We focus on users' interest prediction and are interested in whether an ontology is a suitable tool for modeling the semantics of topics of interest. Let us assume that a user U possesses n topics of interest T_1, ..., T_n, where each topic T_i contains a set of concepts c_{i,1}, ..., c_{i,m} and a set of attributes a_{i,1}, ..., a_{i,k}. An ontological description of the semantics of such a topic of interest consists of the following expressions:
• T_i, which describes the functionality of the topic T_i,
• c_{i,j}, which defines the meaning of the parameter c_{i,j} in the set of concepts,
• a_{i,j}, which illustrates the meaning of the parameter a_{i,j} in the set of attributes of each concept.

Semiotic metrics assessment:
The quality of each ontology is defined across a set of semiotic metrics. These metrics assess the syntactic, semantic, pragmatic and social aspects of ontology quality. We use Protégé-OWL 5.2, an open-source platform that provides tools to construct domain models and knowledge-based applications with ontologies. Figure 3 illustrates the incorporation of the CNN architectures' outputs in our ontology. Each input image is represented by a vector containing the scores of the top 5 concepts among the 1000 concepts of ImageNet [7]; then the ontology vectorization method is used to find each user's interests using the OWL API.

Benchmarks
To model the users' interests, we conducted our ontology construction on twenty-four publicly available databases that represent the twenty-four topics mentioned in table 1, such as the Food Images database [18], the Sport Event database [14] and the DeepFashion database [16]. These are challenging databases, since they consist of various types of users' behavior with different image qualities (high definition, average and very low quality). In the following, we give a detailed description of these databases.

Food Images DataBase
The database comprises 568 food images including sweet (e.g., ice cream, chocolate), savory (e.g., pistachios, sandwiches), processed (e.g., hamburger, French fries, potato chips, chocolate bars) and whole foods (e.g., vegetables and fruits) and beverages (e.g., coffee, orange juice). Images were selected from a commercially available database (Hemera Photo Objects). All images are color photographs with a resolution of 600 × 450 pixels.

Sport Event DataBase
The Sport Event database contains 600 images covering 8 sport event categories: rowing, badminton, polo, bocce, snowboarding, croquet, sailing and rock climbing. Images are divided into easy and medium according to human subject judgement. Each image provides some information on the distance of the foreground objects. All images are color photographs with a resolution of 600 × 450 pixels, collected from non-copyrighted sources on the internet.

DeepFashion DataBase
The DeepFashionDB contains 800 diverse fashion images ranging from well-posed shop images to unconstrained consumer photos. All images are color photographs with a resolution of 600 × 450 pixels.

Phase 2: Users' interest predicting
Our prediction phase requires the construction of VisualDatabase to propagate the topics-of-interest distribution from the image level to the user level.

Visual Users' interest prediction
The visual users' interest prediction is based on the combination of image-level and user-level (VUIP-IL/UL) methods. Figure 4 illustrates the main steps of our proposed method, named VUIP-IL/UL. This method is based on 3 principal steps. In the first step, we apply the pretrained ImageNet CNN architectures for object detection to extract the deep visual features from VisualDatabase. These features describe the image objects with their probabilities. Then, by inferring over our users' interest ontology (UIO) model, each image object is replaced by its corresponding super-concept. In the second step, the probability- and occurrence-based scoring mechanisms are applied to obtain the image-level distribution. In the third step, we build a mapping matrix from the image level to the user level.
Scoring users' interest: To quantify the users' interests, once predicted, we use a scoring function to find the weight of each topic. This mechanism is very powerful, as the users' interest scores will be applied to determine the adapted interest distribution for each user's image and, therefore, for each user. We use probability- and occurrence-based scoring mechanisms. The topic score of each image i ∈ I posted by a given user u ∈ U may be measured from the probability and the occurrence of each object o ∈ O, where an image is represented by a collection of objects O, P is the set of probabilities obtained for the image's objects o ∈ O, and F is the set of occurrences of each object o ∈ O in the given image. Algorithm 1 demonstrates the detailed steps used in our prediction task.
• U: users in the test database.
• I: collection of shared images of each u ∈ U.
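The two image-level scoring mechanisms can be sketched as follows; the object probabilities and the object-to-topic mapping below are assumed toy values, and this is a hedged sketch rather than the paper's exact scoring formula:

```python
# Sketch of the probability- and occurrence-based scores for a single
# image: each detected object adds its detection probability
# (probability-based) or a count of one (occurrence-based) to the bin
# of its topic super-concept.
def image_level_scores(detections, isa):
    """detections: list of (object, probability); isa: object -> topic."""
    prob_score, occ_score = {}, {}
    for obj, p in detections:
        topic = isa.get(obj)
        if topic is None:
            continue  # object not covered by the ontology
        prob_score[topic] = prob_score.get(topic, 0.0) + p
        occ_score[topic] = occ_score.get(topic, 0) + 1
    return prob_score, occ_score

isa = {"espresso": "Drink", "cup": "Drink", "dough": "Food"}
dets = [("espresso", 0.08), ("cup", 0.07), ("dough", 0.06)]
print(image_level_scores(dets, isa))
```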

Image-level for the users' interest distribution
After applying the feature extraction step, each image possesses 5 objects with their probabilities, such as (espresso, 0.08), (cup, 0.07), (dough, 0.06), (ladle, 0.05) and (sandal, 0.04). Using the FaCT++ reasoner and DL queries, we infer over the users' interest ontology to obtain the super-class of each image object. We use the data property "has-Instance" in order to retrieve the super-class of each input image's object, presented as an ontology instance. This step applies the FaCT++ reasoner and the DL query (has-Instance value "image object"), for example (has-Instance value "espresso"). For more details, see Algorithm 2. Accordingly, we define two matrices G of size n × 24 and G′ of size n × 24 to be the affinity matrices between the twenty-four core topics of interest and the n images shared by a specific user u.
• U: users in the test database.
• T: List of 24 core topics of interest.

1. Apply P for object recognition on each i ∈ I to extract the five objects with the highest probabilities.
2. Infer over the UIO, using the FaCT++ reasoner and DL queries, to predict the super-class of every object per i.
3. Score the image-level users' interest distribution.
4. Repeat:
5. Apply steps 1, 2 and 3 for all I.
6. Until n reaches each of the five cases n = 5, n = 10, n = 50, n = 75 and n = 100.
7. Return G, G′: the weight matrices of the probability- and occurrence-based scoring mechanisms.

User-level for the users' interest distribution
According to the already-explained image level, each user u ∈ U possesses two weighted matrices G and G′ for the images shared on social networks (e.g., Facebook). At this level, we intend to generate the target user's interest distribution matrix based on the two scoring mechanisms: (a) the first treats the matrix G, where S(u, k) is the user u's score for the topic k, p_{i,k} is the probability of image i for topic k and n is the number of shared images; (b) the second mechanism treats the matrix G′, defined for k = 1, ..., 24 and i = 1, ..., n, where S′(u, k) is the user u's score for topic k. For more details, see Algorithm 3.
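As a hedged sketch (the exact aggregation formula is not reproduced here), the user-level score of each topic can be obtained by averaging that topic's image-level scores over the n shared images:

```python
# Assumed aggregation: the user-level score for topic k is the mean of
# the image-level scores G[i][k] over the user's n shared images.
def user_level(G, n_topics):
    """G: list of n image-level score vectors, one per shared image."""
    n = len(G)
    return [sum(row[k] for row in G) / n for k in range(n_topics)]

G = [[0.6, 0.4, 0.0],   # image 1: scores over 3 toy topics
     [0.2, 0.8, 0.0]]   # image 2
print(user_level(G, 3))
```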

Experimental Analysis and Discussion
In this study, our aim is to investigate the impact of visual data on the users' interest prediction problem. To this end, we led an intensive experimental study by generating multiple models from different image combinations, and we report the results obtained from each combination. We use the publicly available Caffe implementation [9] to test our model. All of our experiments are evaluated on a Linux x86-64 machine with 32 GB of RAM.

Correlational study between topics of interest
The correlation between two topics X and Y is measured with the Pearson coefficient ρ(X, Y) = Cov(X, Y) / (σ_X σ_Y), where Cov is the covariance, σ_X is the standard deviation of X and σ_Y is the standard deviation of Y. As indicated in figure 6, firstly, we notice that the topic Food is highly positively correlated with the topic Drink, with a value greater than 0.5. This high correlation means that images containing objects belonging to the super-class Food, in our UIO ontology, may also contain objects belonging to the super-classes Drink, Family or People. Secondly, we observe that the topic Fashion is highly negatively correlated with the topic Technology, with a value of −0.553, which means that a user who has Fashion as a topic of interest is very unlikely to be interested in the topic Technology. Finally, the topic Education, for example, is weakly correlated with the topic Culture, with a value of 0.033.
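The Pearson coefficient used in this study can be computed as below; the topic score vectors are made-up values for illustration:

```python
from math import sqrt

# Pearson correlation between two topics' score vectors across users:
# rho(X, Y) = Cov(X, Y) / (sigma_X * sigma_Y).
def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx = sqrt(sum((a - mx) ** 2 for a in x) / n)
    sy = sqrt(sum((b - my) ** 2 for b in y) / n)
    return cov / (sx * sy)

food = [0.9, 0.7, 0.8, 0.2]   # illustrative per-user topic scores
drink = [0.8, 0.6, 0.9, 0.1]
print(round(pearson(food, drink), 3))  # strongly positive, close to 1
```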

Image-level for the users' interest distribution study
To illustrate this level, we propose a demonstration with the Family class. According to figure 7, we remark that this distribution is articulated around four topics: Outdoors, Drink, People and Food. These results validate the positive correlation between these topics and the fact that a user who is interested in the topic Family may share images belonging to these discovered topics. For the same class, figure 8 illustrates this distribution level using 100 shared images per user. In fact, with 100 images, we remark that the image distribution becomes more detailed, with the appearance of new topics with low scores, such as Shopping, Places and Entertainment, and the disappearance of the self-assessed topic for some users. This appearance and disappearance are explained by the diversity of images, which can generate vectors with low scores for several topics.
To conclude, we can assume that on social networks (e.g., Facebook), each user may share a set of images which can be related or unrelated to his/her self-assessed topic of interest. With 50 images, the distribution is generally very close to reality, with the self-assessed topic appearing with the highest score; with 100 images, this distribution becomes more detailed, with new topics, correlated with the self-assessed topic, assigned to each user.

User-level for the users' interest distribution study
At this level, we attempt to define the confidentiality area of the users' shared images. For this reason, we used k shared images by each user within the 24 classes. From each class, we obtained a confidential area which generates the target users' interest matrix with a high score for the self-assessed topic. Table 5 describes the variation of the accuracy measure for each class according to the number of shared images per user. This variation reflects the fact that each user's topics-of-interest distribution consists of five terms: a starting term with 5 images, a middle term with 10 images, a long term with 50 images, a very long term with 75 images and an extreme term with 100 images.
• The starting term presents the sharing of the first 5 images that each user chooses, indicating their self-assessed topic of interest,
• For the middle term, the same user may be influenced by other topics and can share some images that can disrupt our classification, which explains the decrease of the system performance from 0.85 to 0.75,
• In the long term, after being biased in the middle term, the user settles back into his/her self-assessed topic of interest, and our system predicts the correct target class with 50 shared images, obtaining an accuracy of 0.95,
• In the very long term, the user remains stable with a slight disturbance of the distribution obtained in the long term. For this reason, our system shows a slight decrease in performance, with an accuracy of 0.80,
• In the extreme term, our system performance undergoes a very remarkable decrease, with an accuracy of 0.65, which confirms that beyond 50 images the distribution of the topics of interest is disturbed by the diversity of the images, which negatively influences the self-assessed topic.
To better visualize this variation, figures 9-12 show the Cumulative Match Characteristic (CMC) curves. We remark that some classes possess a high accuracy value for a specific number of images, other classes present a stable accuracy value across numbers of images, and several classes have an accuracy of 0 for some numbers of images. For example, the Lifestyle class presents an accuracy of 0 for 5, 10, 50, 75 and 100 shared images, and the News class has an accuracy of 0 for 5, 10 and 50 images. In fact, users who have Lifestyle or News as their self-assessed class are more likely to have perturbations in their shared images that change the output class to some other target class. In this context, we discuss the reasons why some images are misclassified into classes other than the self-assessed class through figure 13.
To evaluate our framework, we deliver a continuously growing set of pre-trained models with famous architectures for the Caffe framework [10] (table 3). One focus of our work is the depth of the CNN, which affects the capability of the convolutional layers. Thus, we make use of four different CNN architectures, compared in table 7 in terms of test accuracy. GoogleNet is the best architecture, based on the idea of executing layers in parallel with the inception module. It helps us to get a better classification accuracy by extracting information about very fine-grained details in depth. To highlight our framework, from table 8, we can see that it outperforms the literature methods on several levels. We can valorize our work by choosing Facebook as a source of data among other social networks, through the implementation of a specific Facebook application [13], compared with other works [19,24,23] that use Pinterest as the social source by applying a crawling method based on public APIs. In addition, we apply four CNN architectures for object recognition to enhance the feature extraction and classification modules compared to those of other works. To evaluate the performance of our algorithm, we apply two diverse criteria, 1) precision and 2) recall, illustrated in tables 9 and 10 respectively. We used the Cumulative Match Characteristic (CMC) curve to illustrate the evolution of the interest prediction rate with the number of shared images and to compare this rate for each class.

Results Discussion
In this section we discuss our obtained results by evaluating the performance of each architecture used in the feature extraction phase. Each CNN architecture contains two separate modules: the feature extractor and the classification module. The performance of any CNN architecture is related to the feature extractor module's parameters, especially the number and size of filters and layers. AlexNet is one of the first deep convolutional neural networks to deal with the complex scene classification task. AlexNet has 5 convolutional layers, 3 sub-sampling layers and 3 fully connected layers. It uses filters with sizes of 11 × 11, 5 × 5 and 3 × 3 in its convolutional layers. Furthermore, to achieve better performance, the complexity of convolutional neural networks has been continually increased with deeper architectures. VGG'19 is much deeper than AlexNet, with 19 layers, including sixteen convolutional layers with filters of size 3 × 3 and 3 fully-connected layers. The use of very small filter sizes captures a set of fine deep visual features from the input image, decreases the number of parameters per filter and allows an increase in the number of filters. The increase in the number of filters augments the depth of the feature maps and consequently the depth of the network, which is a critical component for good performance in the users' interest classification task based on complex scene image classification. While VGG'19 relies on filter simplicity and network depth, GoogleNet is one of the first architectures to introduce the idea of executing layers in parallel with the inception module, based on a concatenation operation at different scales. GoogleNet uses 9 inception modules with a creative structuring of layers in order to improve performance and computational efficiency. Hence, GoogleNet helps us to get better classification accuracy by extracting information about the very fine-grained details in the volume.
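The effect of filter size on layer cost mentioned above can be made concrete with a quick parameter count; the channel sizes follow the standard AlexNet and VGG configurations:

```python
# Parameters of one k x k convolutional layer: each of the out_ch
# filters holds in_ch * k * k weights plus one bias.
def conv_params(in_ch, out_ch, k):
    return out_ch * (in_ch * k * k + 1)

# AlexNet's first layer: 11 x 11 filters, 3 -> 96 channels.
print(conv_params(3, 96, 11))   # 34944
# A VGG-style 3 x 3 layer with many more filters, 64 -> 128 channels.
print(conv_params(64, 128, 3))  # 73856
```

Small filters keep the per-filter cost low, which is what lets VGG'19 stack many more filters and layers within a comparable budget.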
After having discussed the efficiency of the feature extraction phase, we analyse the images that are misclassified into classes other than the self-assessed class. This misclassification is caused by diverse perturbations, in the form of delicate disturbances in each input user's images, that lead our framework to predict an incorrect class compared to the self-assessed class. Users who have self-assessed classes like Relationship, News, Wellness, Lifestyle and Hobbies are more likely to have perturbations in their shared images that change the output class to another target class. Several examples of misclassified users are shown in figure 13. The labels above the selected images present the self-assessed classes for each group of users, and the labels below the images are the target classes obtained by our framework. In the first group, on the right, we see a set of images shared by users who have News as their self-assessed class. The classification of these images is based on the objects belonging to each one, obtained by the CNN architecture for object recognition. These objects possess Fashion, Places or People as super-classes in our UIO ontology. This inference produces a perturbation that predicts target classes other than the one self-assessed by the user. In the second group, in the middle, we illustrate some images shared by users who have Relationship as their self-assessed class. The classification of these images produces Outdoors and People as target classes. This misclassification is caused by the fact that the objects in these shared images have Outdoors or People as super-classes in our UIO ontology. In fact, the topic Relationship reflects the importance of making friendships with people in order to have a good time together. In the third group, on the left, we present diverse images shared by users who have Wellness as their self-assessed topic.
Since these shared images contain objects whose super-classes are Outdoors or People, these users obtain Outdoors or People as target classes in our classification.

Conclusion
In this paper, we have examined the correlation between topics of interest in social networks (e.g., Facebook) and defined the confidential area for the number of shared images in order to obtain the best user's interest distribution. A novel joint framework, named DeepVisInterests, was established to predict users' interests from visual data, applying mainly CNN architectures for the feature extractor and classification modules. We have introduced a novel users' interest model to conceptualize and categorize the 24 topics of interest into a semantic representation using an ontology. We have systematically evaluated the proposed framework on our VisualDatabase, which contains over 24,000 images. Our system has shown competitive results compared to other state-of-the-art techniques.