Accelerating your initial literature review: a process for graduate students starting in research

By Marc-André Léger, Ph.D., MBA, MScA, C.Adm.
Email: marcandre@leger.ca

Abstract

This article presents a simple strategy to accelerate literature reviews. The approach was developed for new graduate students who wish to engage in scientific research but have little knowledge of how to perform a systematic search of academic sources and scientific journals on a particular topic. However, it may be useful to many others. The approach was used successfully by a research team to perform literature reviews supported by tools such as Zotero and LitMap, and by specialized websites such as Scopus, Web of Science and Engineering Village.

Keywords

Search, systematic review, literature, LitMap, Zotero

Systematic Review strategy

Step 1: Determine the topic

Step 2: Choose keywords

Step 3: Identify limits

Step 4: Access library databases

Step 5: Identify and include additional databases

Step 6: Export results to Zotero

Step 7: Remove duplicates

Step 8: Triage articles and add keywords

Step 9: Create, build, and optimize a literature map

Step 8: Add suggested articles with LitMap

Step 9: Perform a final import of the articles to Zotero

Step 10: Download all the articles

Step 11: Perform a review of the articles for relevance

Step 12: Analyze the documents

Example of the application of the method

Determination of the research topic:

Identification of the initial keywords:

Determination of an initial Search expression:

Selection of databases

Selection of additional databases of scientific and peer-reviewed material

Export results to Zotero:

Remove duplicates in Zotero:

Triage of documents in Zotero:

Identification of additional keywords following the triage:

Creation of a literature map

Import the articles added with LitMap back to Zotero

Using the articles resulting from the systematic search

Strategy 1: Automated summaries

Strategy 2: Initial bibliometric analysis

Strategy 3: Advanced bibliometric analysis

Bibliography

Appendix A: articles from the example

Introduction

Many years ago, I was very fortunate to have a high school teacher with a Ph.D. who taught us about methodological approaches. At the time, I had no idea that this would be of any importance to me, nor did I have any inclination to do a Ph.D. myself one day. Of course, today, I understand how important it is to take a systematic approach to resolving problems and how a scientific method can be used to build up some form of proof of the validity of the answers provided to these questions. Since then, I have taken many research methodology courses, written a few dissertations, articles and other papers, and introduced students to the scientific method.

Among those new to scientific research, I have noticed that many struggle with how to get started on their initial literature review. This is a critical early step in scientific enquiry that is used to get a grasp of the current state of knowledge on a topic. It is also when many researchers define the scope of the project, identify initial hypotheses, and determine an initial research question. Of course, hypotheses and research questions may evolve further at a later stage in the process, but with this initial work researchers at least have a starting point for discussions with colleagues or a research director, material to use in funding requests, and peer-reviewed sources to start writing a research proposal. Hence, this article is particularly intended to help students get started on their path into scientific research, with the hope that they can rely less on Wikipedia, blogs, and Google when they write the papers that they submit.

Systematic Review strategy

The strategy proposed for conducting an initial literature review is to use available tools and take a simple systematic approach. Using databases and resources available in most university libraries, students can identify reliable, peer-reviewed sources to document the current state of scientific knowledge on their research topic. The next sections present the proposed steps, starting with the choice of a research topic.

Step 1: Determine the topic

A scientific research project starts with a subject to be explored. This subject can be chosen in many ways: from a personal interest of the researcher, as a course assignment, or to take advantage of a funding opportunity. Nearly all subjects can be valid opportunities for scientific enquiry. In a new or innovative area, the subject can be relatively vague or ill-defined. However, as the field matures and the researcher gains expertise on the topic, it can become quite narrow.

Since this article needs examples, it is necessary to determine a topic. As the principal field of research of the authors of this article is cybersecurity, this formed the basis for the topic of the first example. Therefore, for the first example, the search presented in this article is on the general topic of information security. This is based on a personal interest. Since this is a very broad topic, to make it a bit more realistic, the article investigates information security applied to operational technologies, those used for the management of critical infrastructures. For readers more familiar with information technologies (IT), the technologies used in organizations to help them manage information, operational technologies (OT) are technologies used to manage infrastructure and industrial equipment by monitoring and controlling them. OT is used in, but not limited to, critical infrastructures such as the electrical grid of a country, a region or, in the case of Canada, a province. In the project used as an example, we focus on their use in monitoring and controlling a particular critical infrastructure, the electrical grid providing electrical power to cities in Canada.

At this point it is possible to create a concept map to help better define the topic before going on to the next step. Concept maps, such as mind maps, are very helpful to get a better grasp of a topic and decompose it into core concepts. The concept map is not presented in this article, but there are many good tutorials on how to create one. Building concept maps in class is something done with students to help them. The resulting topic is:

Information security of operational technologies in Canada

This is what is used for the next steps as an example.

Step 2: Choose keywords

As mentioned, information security of operational technologies in Canada is selected as the topic for the project described. Computer security and cybersecurity are used as synonyms for information security and are added to the initial search. In a real-world scenario, the input of advisors, professors and research team members can contribute to defining the initial keywords. From this topic, three main concepts are identified:

Information security, with the synonyms computer security and cybersecurity
Operational technologies, critical infrastructures
Electrical grid

The first element helps to identify the articles on the general topic. The second and third elements help to narrow down the results to be more relevant. In addition to the recommendations of co-researchers, Google and online thesauri can be used to identify synonyms, which can help in the initial searches. This may require some trial and error to refine, as is explained later. Table 1 presents an overview of the search results in Polytechnique Montréal’s and Concordia University’s online library search engines for the selected keywords, as well as the results from Google Scholar. Identifying and validating the appropriate keywords, operators (AND, OR, NOT) and combinations thereof may require multiple trials, errors and improvements. While this becomes easier as the researcher gains experience, it may be a long and relatively tedious process. There is no magic number of articles needed, as research projects differ. In this case, for a relatively small project with a relatively small team, an initial number below 1000 articles is targeted. Again, readers need to keep in mind that this is a very early stage in the project and in the literature search. At the end of the process, the number should be much lower, well under 100 in most cases.

Search expression	Results Polytechnique	Results Concordia	Results Google Scholar
Information security OR computer security OR cybersecurity	430 719	536 309	4 110 000
"Information security" OR "computer security" OR cybersecurity	68 820	87 133	426 000
"Operational technologies" OR "critical infrastructures"	605	3 836	17 100
"Operational technologies" AND "critical infrastructures"	0	5	224
("Information security" OR "computer security" OR cybersecurity) AND ("Operational technologies" OR "critical infrastructures")	790	878	17 800

Table 1: Initial searches

While the results are helpful, the sheer number of results remains too large to be useful for an initial literature review in the scenarios identified in this article. However, the last query could be appropriate for the intended purpose once further limits are added, as described in the next step. Therefore, the next step proceeds with the following query:

("Information security" OR "computer security" OR cybersecurity)
AND
("Operational technologies" OR "critical infrastructures")
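
As an illustration of how such an expression is assembled, the following minimal Python sketch joins the synonyms of each concept with OR and joins the concepts with AND. The function name build_query is hypothetical, and the sketch assumes that the search engine accepts straight double quotes for exact phrases, which should be verified for each database.

```python
def build_query(concept_groups):
    """Join synonyms with OR inside each group, then join the groups with AND."""
    clauses = []
    for synonyms in concept_groups:
        # Multi-word terms are quoted so that they are searched as exact phrases
        # (assumption: the target search engine uses straight double quotes).
        terms = [f'"{term}"' if " " in term else term for term in synonyms]
        clauses.append("(" + " OR ".join(terms) + ")")
    return " AND ".join(clauses)


# The two concept groups used in the query retained in this step.
concepts = [
    ["Information security", "computer security", "cybersecurity"],
    ["Operational technologies", "critical infrastructures"],
]
query = build_query(concepts)
print(query)
```

Keeping the concepts in a list makes it easy to add or remove a synonym and regenerate the expression during the trial and error phase.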

Step 3: Identify limits

Starting from the results of the previous steps, restrictions are added to limit the results to scientific articles published in English in peer-reviewed journals during the last ten years. This is done because the intention is, at least in part, to assess the current state of the art in the research domain of the study. Current is initially defined as going back only 10 years. Since the number of results may still be high, the restriction can also be set to 5 years. Therefore, the final search expression from the previous step, ("Information security" OR "computer security" OR cybersecurity) AND ("Operational technologies" OR "critical infrastructures"), is used with different restrictions, as shown in table 2.

Search restrictions	Results Polytechnique	Results Concordia	Google Scholar
No restrictions	791	878	17 800
Limited to last 10 years	587	652	16 600
Limited to last 5 years	380	415	10 800
Limited to articles from the last 5 years	291	304	639
Limited to articles in scientific journals from the last 5 years	89	96	N/A
Limited to articles in English, in scientific journals from the last 5 years	55	59	629

Table 2: Searches with restrictions

The results are at a volume that appears more reasonable for an initial search. It would seem appropriate to use this as the focus of the literature review for the project. In the next step, further research is done to try to identify the most influential and most cited scientific articles on the topic at hand. On this basis, the search will continue using the following query:

("Information security" OR "computer security" OR cybersecurity) AND ("Operational technologies" OR "critical infrastructures"), limited to articles in English, in scientific journals from the last 5 years.

In this example, adding English as a limit could be omitted, since the previous limits resulted in a number just below the 100 articles that had been identified as a workable limit. However, articles that are not in English would then have to be eliminated at a later stage if the reviewers are not able to read them. In addition, anecdotal evidence suggests that the publication language is not always reliably recorded in the databases.

To help students who might be doing this for the first time, arbitrary limits are mentioned, as students like to have specific numbered goals. The recommendation for them is a minimum of 50 articles for master’s level research, 100 for Ph.D. level research and 200 for a Ph.D. dissertation, multiplied by the number of individuals in the team. These highly subjective limits are only meant as guidance for inexperienced researchers; experienced researchers should set their own limits in accordance with their experience, resources, and the time available for the project.
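
The arithmetic behind these guideline figures can be written down in a few lines. The sketch below is only a convenience for planning; the dictionary keys and the function name suggested_article_count are hypothetical, while the figures are the ones quoted above.

```python
# Guideline minimums per team member, as quoted in the text above.
GUIDELINE_PER_PERSON = {
    "masters": 50,
    "phd": 100,
    "phd_dissertation": 200,
}


def suggested_article_count(level, team_size=1):
    """Return the guideline minimum for a project level and team size."""
    if team_size < 1:
        raise ValueError("team_size must be at least 1")
    return GUIDELINE_PER_PERSON[level] * team_size


# Example: a team of three master's students.
print(suggested_article_count("masters", team_size=3))
```

For a team of three master's students this gives 3 x 50 = 150 articles as an initial target.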

Step 4: Access library databases

Retrieving the documents is done using databases available on the Polytechnique Montréal and Concordia University library websites. They are selected because these are the libraries available to the authors of this article. As the two universities have different database subscriptions, this allows additional sources to be identified. However, it may result in many duplicates, which can easily be resolved later. This is a good strategy for research teams or post-graduate students, who often have affiliations with different institutions. As shown in table 2, this resulted in 55 and 59 articles. They are all exported directly into the Zotero reference manager, using a web browser plugin.

Step 5: Identify and include additional databases

In this next step, databases that aggregate scientific articles or that offer larger data samples are used. This casts a wide net to increase the likelihood of including important literature in the project. In particular, the following databases are used, as they are among the best known and most used sources for citation counts:

Scopus.com
Web of Science
Engineering Village
Google Scholar

Scopus

As described online (https://www.elsevier.com/solutions/scopus), Scopus provides an abstract and citation database for scholarly literature. It provides metrics that are useful to determine the impact of peer-reviewed scientific journal articles. In the context discussed in this article, Scopus is used to identify influential articles on the topic at hand.

Using the query identified in the previous section, ("Information security" OR "computer security" OR cybersecurity) AND ("Operational technologies" OR "critical infrastructures"), the search in Scopus produced 267 results, which are sorted by number of citations. The top 50 references are exported to a file in BibTeX format. The complete Scopus results can also be exported and used later to perform the bibliometric analysis described in Strategy 2. As described in that section, this bibliometric analysis can serve as validation of the relevance of the process and the results, and it can provide additional insights into the domain.
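
In practice the sorting and selection are done in the Scopus web interface. For readers who prefer to work from a full export, the following minimal Python sketch shows the same selection logic on a list of records. The record structure and the field name citations are assumptions made for the illustration, and the sample records are fictitious.

```python
def top_cited(records, n=50):
    """Return the n records with the highest citation counts."""
    # Records without a citation count are treated as having zero citations.
    return sorted(records, key=lambda r: r.get("citations", 0), reverse=True)[:n]


# Fictitious records, for illustration only.
records = [
    {"title": "Paper A", "citations": 12},
    {"title": "Paper B", "citations": 240},
    {"title": "Paper C"},
    {"title": "Paper D", "citations": 57},
]
top = top_cited(records, n=2)
print([r["title"] for r in top])
```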

Web of Science

Web of Science is similar to Scopus. It provides access to a large citation database that can be queried. However, as it is managed by a different media group, it offers different results from Scopus. The objective of using both is to catch the most cited articles on the topic. As duplicates are removed in a later step, this should limit the effect of any biases created by the different databases. Using the same query as in the previous steps produced 103 results, which are sorted by number of citations. The top 50 are exported to a file in BibTeX format. Here as well, the complete Web of Science results can be exported to be used later for a bibliometric analysis.

Engineering Village

Engineering Village is a database of scientific journal articles that specializes in the fields of applied science and engineering. It is used to complement the previous searches. The search in this database produced 222 results, sorted by relevance. The top 50 are exported to a file in BibTeX format.

Google Scholar

Google Scholar is a service of the Google search engine that specializes in scientific publications. It is used in this search strategy to complement the previous searches with additional material. The search in this database produced 649 results, which are sorted by relevance. The top 40, which correspond to the first two pages of results, are exported using the web browser plugin.

Step 6: Export results to Zotero

Using the previous queries, the results from the library searches are imported into Zotero using the Google Chrome plugin. For this purpose, a library named Critical Infrastructure is created in the Zotero account. Zotero is chosen because the research team is familiar with the product and because it is a tool recommended by librarians. However, there are many other similar tools that can be used to achieve the same result. For the Scopus search, it is necessary to export the results from the Scopus website in BibTeX format, adding Abstracts to the default export settings in the export menu on Scopus. This generates a file named Scopus.bib that can then be imported into Zotero using the File – Import menu. A similar process is used for Web of Science and Engineering Village, but with the different default filenames created by those sites. For Google Scholar, the Chrome web browser plugin is used. In this example, only the first 40 Google Scholar entries, sorted by relevance, are imported. The number of Google Scholar results that are saved may vary based on the time and resources available to the project, but a maximum of 100 would be more than sufficient in most cases.

Step 7: Remove duplicates

After all the references are imported into Zotero, it is necessary to remove duplicates in Zotero. This is required as the results from the different queries will overlap from one database to the other. It is done using a specific remove duplicates function in Zotero. In the example, once this is complete, there are 181 documents remaining.
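
Zotero handles this step through its own interface. To make the underlying idea concrete, here is a minimal Python sketch that treats two records as duplicates when they share a DOI or a normalized title. This is not Zotero's algorithm; the record fields doi and title, the function names and the sample records (including the DOIs) are hypothetical. A known limitation of the sketch is that two different articles with identical titles would be merged.

```python
import re
import unicodedata


def normalize_title(title):
    """Lowercase, strip accents and collapse punctuation and spaces."""
    text = unicodedata.normalize("NFKD", title)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def remove_duplicates(records):
    """Keep the first record seen for each DOI or normalized title."""
    seen_dois, seen_titles = set(), set()
    unique = []
    for record in records:
        doi = (record.get("doi") or "").strip().lower()
        title = normalize_title(record.get("title") or "")
        if (doi and doi in seen_dois) or (title and title in seen_titles):
            continue  # already seen under the same DOI or title
        if doi:
            seen_dois.add(doi)
        if title:
            seen_titles.add(title)
        unique.append(record)
    return unique


# Fictitious records, as they might come from different databases.
records = [
    {"title": "Securing the Smart Grid: A Survey", "doi": "10.1000/example.1"},
    {"title": "Securing the smart grid - a survey", "doi": ""},
    {"title": "SCADA Security", "doi": "10.1000/EXAMPLE.2"},
    {"title": "SCADA security (reprint)", "doi": "10.1000/example.2"},
    {"title": "OT risk"},
]
unique = remove_duplicates(records)
print(len(records), "records before,", len(unique), "after")
```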

Step 8: Triage articles and add keywords

The results are then submitted to a first review, looking only at the title and the abstract of the articles. This is done to quickly ensure that articles are relevant and ready to be used in the next steps. In the example described here, 20 articles are removed as they do not indicate a link to the research subject. This step is also an opportunity to identify terms and keywords that may become useful later in the process. These should be noted, as is done for this project in the list presented here:

critical infrastructure
public utilities (power grid, electricity production, nuclear power generation plant, wind turbines, gas distribution network, drinking water production and distribution)
smart cities
Maritime and air transport and shipping
Operational Technologies (OT)
Operational environment
SCADA
industrial controls
Industrial Internet of things (IIoT)
Internet of things (IoT)
cyber-physical systems
Industry 4.0
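
To judge which of the keywords noted during triage are worth keeping, it can help to count how many of the retained records mention each one. The following minimal Python sketch does this with a simple case-insensitive substring match on the title and abstract. The record structure, the function name keyword_coverage and the sample records are hypothetical, and a substring match is a deliberately naive choice.

```python
from collections import Counter


def keyword_coverage(records, keywords):
    """Count, for each keyword, the number of records that mention it."""
    coverage = Counter({kw: 0 for kw in keywords})
    for record in records:
        text = f"{record.get('title', '')} {record.get('abstract', '')}".lower()
        for kw in keywords:
            if kw.lower() in text:
                coverage[kw] += 1
    return coverage


# Fictitious records, for illustration only.
records = [
    {"title": "SCADA intrusion detection", "abstract": "Industrial controls in the power grid."},
    {"title": "Securing smart cities", "abstract": "IoT and SCADA deployments."},
    {"title": "Cyber-physical systems risk", "abstract": ""},
]
keywords = ["SCADA", "industrial controls", "smart cities", "cyber-physical systems", "Industry 4.0"]
coverage = keyword_coverage(records, keywords)
for kw in keywords:
    print(kw, coverage[kw])
```

Keywords that appear in many records are candidates for refining the search expression, while keywords with no matches may be dropped.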

Step 9: Create, build, and optimize a literature map

From there, a reference mapping tool is used to again try to ensure that all the important references are found and included in the project. The web tool LitMap is chosen for this project (https://app.litmaps.co/), and a BibTeX file export of the articles remaining after the triage step is imported.

Figure 1: LitMap graph

The LitMap tool then suggests new articles that are potentially relevant based on those already in the map. It also gives the research team a visual outlook of the literature, helping them to better understand what is there and to identify the evolution of knowledge in the field, the connections in the literature, and the significant articles that are more connected to the corpus of knowledge.

Step 8: Add suggested articles with LitMap

Using LitMap, it is possible to identify additional relevant articles that are connected to the journal articles resulting from the previous steps. Several factors may come into play in determining relevance, such as shared references, keywords, authors, and themes. Using the Run Explore feature of LitMap produces a list of these suggested articles. By looking at their title and abstract, it can be determined if they should be added. Generally, the articles that appear most likely to add value to the work should be added at this point. Articles published at an earlier date than the limit determined at step 3 should also be added if they are highly cited and relevant, as they may be key sources that have a high impact in the research domain. Figure 2 gives an example of the Explore function of LitMap.

Figure 2: Explore function of LitMap

By using this feature of LitMap and refreshing after adding a few articles to the map, it is possible to add many other relevant articles that are highly connected to this map. An example is presented in figure 3. In the search performed in this article, after a few cycles of adding and refreshing, the map grew to 171 articles.

Figure 3: Example of a connected article

Step 9: Perform a final import of the articles to Zotero

Following the previous step, the group of articles to be used in the literature review is identified. Using the export map to BibTeX functionality in LitMap produces a file that can then be imported into a new library in Zotero. This library will contain all the articles. Depending on the options selected in the import and on whether a new library collection is used, it may be necessary to remove duplicates if there are any. A good reason to keep the previous references and remove duplicates instead of deleting them is to take advantage of the full-text copies of articles already attached to the existing Zotero references. Keeping these will save time in the next step, when all the articles are retrieved.

Step 10: Download all the articles

As mentioned, some of the articles are already attached in Zotero, as they were included in previous steps. However, for the next steps, it is necessary to have copies of all the articles available. Performing a literature review involves reading the articles, so having a copy of them is an obvious necessity. The articles may be retrieved in different ways. The recommended way is to add a copy of the article directly into Zotero, using the Add a copy of a file option. This requires a local copy of the PDF file of the article to be downloaded, which can usually be done by selecting the reference and double-clicking on it in Zotero, which opens the article from the library’s database. Do note that finding all the articles may take some time, depending on the number of articles. A ballpark estimate is 5 to 10 minutes per missing reference. A PDF icon will appear next to articles that are added. In some cases, it may be necessary to copy the article title into a Google search, which generally makes it possible to find a link to access and download the article.

When saving a local copy of an article, using a standardized naming pattern makes later identification possible. Any scheme is fine if it is applied consistently throughout the project and the team. Articles that still cannot be located after a thorough search should be removed if there are only a small number of them, as they would not be useful for the next steps.
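
As one possible naming scheme, the following minimal Python sketch builds a filename of the form author_year_short-title.pdf. The pattern itself, the function name article_filename and the sample reference are hypothetical; the point is only that the name is generated the same way for every article.

```python
import re
import unicodedata


def slug(text):
    """Reduce text to lowercase ASCII words separated by hyphens."""
    ascii_text = unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


def article_filename(first_author, year, title, max_words=6):
    """Build 'author_year_short-title.pdf' from reference metadata."""
    # Only the first max_words words of the title are kept to limit the length.
    short_title = "-".join(slug(title).split("-")[:max_words])
    return f"{slug(first_author)}_{year}_{short_title}.pdf"


# Fictitious reference, for illustration only.
name = article_filename("Doe", 2020, "An Example Title About SCADA Security: Part One")
print(name)
```

Accented characters are reduced to their ASCII equivalents so that the files remain easy to share across operating systems.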

Step 11: Perform a review of the articles for relevance

Once all the articles have been downloaded, a more in-depth review can be made to assess their relevance. This step could be done by a team of researchers with the assistance of students. It requires that inclusion and exclusion criteria be identified. At this point there should be enough domain knowledge to make this feasible. If the number of articles is not too large, it might be acceptable to omit this step. As presented later in this article, a strategy that might be considered is to first perform a review by looking at the abstracts only to assess relevance for inclusion or exclusion. A second review can then be performed by reading more comprehensive machine-generated summaries. This would be followed by the reading of the full articles that make it through the process.

Step 12: Analyze the documents

Using the final selection of articles, this step requires one of two strategies: read or automate. Reading involves, as the name implies, reading the articles, highlighting key information and taking notes, using Zotero or another tool that team members are familiar with, such as Microsoft OneNote. Automation involves using natural language processing (NLP), perhaps by writing Python code for this purpose. Much analysis can also be done with RStudio, applying document analysis capabilities that are well documented online. Other strategies involve using specialized off-the-shelf document analysis tools or bibliometric tools, which can be purchased. This article makes no specific recommendation, as there are too many factors at play in determining the best strategy, but it presents further steps that can be used in an upcoming section. Students or new researchers would be better off reading the articles and preparing notes, to learn and experience the process.
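
To give an idea of what the automation route can look like, here is a minimal Python sketch that counts the most frequent terms across a set of texts using only the standard library. It assumes that the text of each article has already been extracted from its PDF file, which is not shown here. The stop word list is a small hypothetical sample, and the function name term_frequencies and the two sample sentences are made up for the illustration. A real project would more likely rely on an established NLP library.

```python
import re
from collections import Counter

# A very small, hypothetical stop word list; real lists are much longer.
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "for",
             "is", "are", "on", "with", "that", "this", "by", "as"}


def term_frequencies(texts, min_length=3):
    """Count word occurrences across texts, ignoring stop words and short words."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z]+", text.lower())
        counts.update(w for w in words if len(w) >= min_length and w not in STOPWORDS)
    return counts


# Made-up sentences standing in for extracted article text.
texts = [
    "Security of SCADA systems in the electrical grid.",
    "Grid operators monitor SCADA systems for security incidents.",
]
freq = term_frequencies(texts)
print(freq.most_common(4))
```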

Once all the steps are completed, it becomes possible to use the material for the intended purpose, such as writing a literature review or performing further analysis, as presented in an upcoming section of this article. The next section demonstrates an example of the application of this method on a different topic, IT compliance management.

Example of the application of the method

This section presents an example of the application of the process that is described in the previous part of this article. It is done by applying the various steps for a research project on IT compliance management.

Determination of the research topic:

IT compliance management solutions

Identification of the initial keywords:

Information technology
IT
Compliance
Compliance management

Determination of an initial Search expression:

(“IT” OR “Information technology”) AND (“Compliance Management”)

Selection of databases

Restriction	Concordia	Polytechnique	Google
Number of results (no restriction)	271	229	17 900
Last 10 years	160	136	12 300
Last 5 years	71	55	5 840
Articles	50	43	236
Scientific journals	34	29	236
English	20	17	226

Selection of additional databases of scientific and peer-reviewed material

Restriction	Scopus	Web of Science	Engineering Village
Number of results (no restriction)	256	152	324
Last 10 years	147	99	177
Last 5 years	61	45	59
Articles	32	29	17

The scopus.bib and wos.bib files are saved to be used later for a bibliometric evaluation.

Export results to Zotero:

155 documents

Remove duplicates in Zotero:

94 documents left

Triage of documents in Zotero:

Removed unrelated documents: tax compliance, customs risk and compliance, environmental compliance, healthcare-related compliance, such as medication or treatment, Occupational Safety and Health Administration (OSHA) compliance
Removed: articles not in English
Removed: articles that did not appear to be at an appropriate level or too basic to be considered scientific.
56 documents left after the triage step

Identification of additional keywords following the triage:

Information security compliance
Cybersecurity compliance
Security policy compliance
Information security policy compliance
Business process compliance
Privacy compliance
Legal and regulatory compliance
Compliance failure

Creation of a literature map

LitMap: https://app.litmaps.com/shared/workspace/C0D77D41-1B1E-4C9E-BB69-A60FF80ACDF2/map/E0999F2B-DC33-4CC7-BC0A-A17D5D21891F

Adding related and connected articles with LitMap:

Imported the bibtext file
Started an initial search for new articles and waited 24 hours
Added additional articles
Ended the process with 72 articles

Import the articles added with LitMap back to Zotero

Removed duplicates
70 articles were left at the end of this process

At this point all the documents are merged into a single Zotero collection, a last review for duplicates is performed, all PDF files are located, and article summaries are added when not present. A few articles are removed as it is not possible, after many attempts using different sources, to locate a PDF version of the article. In the end, 107 articles remained for analysis. A list of the remaining articles is presented in appendix A.

Using the articles resulting from the systematic search

The first part of this article presented a method to identify journal articles and scientific literature that can be used in scientific research. The second part presented an example that concerned IT compliance management. As mentioned, getting started on a literature review for a research project can often be a difficult task for individuals starting in empirical research and for new graduate students. These are the prime targets for this article. In the next sections we propose different tools and strategies to accelerate the process. It should be mentioned that some of the proposed approaches could be misused and produce results that could be considered plagiarism or academic fraud. Any use of the material in a dissertation needs to be discussed with research advisors, and ethical concerns need to be investigated. However, the authors of this article believe that using tools to assist in research can be very beneficial when done appropriately.

The final part presents a few strategies that can be used to assist in the literature search process. The first strategy proposes using tools to automate the creation of expanded text summaries, which may be helpful to evaluate the usefulness of documents in more depth than what is provided by author-provided abstracts. The second and third strategies use RStudio to perform bibliometric analysis of the documents, to help gather initial insights into the corpus of knowledge that was assembled and accelerate the initial phases of research.

Strategy 1: Automated summaries

To accelerate the review of many articles, tools can be used, as mentioned in step 11. In this article, Wordtune Read (https://app.wordtune.com/read) is used to produce this initial analysis. A similar result can also be achieved by using Python code with machine learning libraries. However, a quicker approach using an off-the-shelf solution is preferred here. With this tool, once all the PDF versions of the articles have been located, as presented in step 10, they can simply be dragged and dropped from Zotero onto Wordtune Read to generate an initial summary. This summary can then be copied back into the notes section of Zotero, associated with the article. While an initial selection is made by reading the abstracts, this summary can then be used to perform a further review and selection. Of course, readers need to be reminded that this summary should not be used as-is to create an assignment, an article or material intended for publication.

The process to generate an automated summary:

Select a citation in Zotero
Open the tab
Right-click on the PDF file and select Locate document
Drag-and-drop the PDF file onto Wordtune Read
Wait for the summary to be created
Use the Copy All option in Wordtune
Create a new note
Give the note a title, such as Wordtune summary, to avoid misuse later
Paste the summary into the note

Once summaries of articles are produced, they can be used to perform a second level of review, remembering that the first review is done by reading the author’s abstract, available from the publisher. The Wordtune-produced summary provides further material to determine the relevance of the article for the study. At this stage, a checklist of inclusion and exclusion criteria can also be created to help the process. Eventually, Python and NLP could be used to perform a selection based on the summary, should there be too many articles to review manually with the human resources available in the project.
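
A first approximation of such a selection can be sketched without any NLP library, by checking each summary against lists of inclusion and exclusion terms. The sketch below is a simplification, not a validated screening method: the function name screen_summary, the term lists and the sample summaries are hypothetical, and any summary matching both lists or neither list is sent to manual review.

```python
def screen_summary(summary, include_terms, exclude_terms):
    """Suggest a decision for a summary from inclusion and exclusion terms."""
    text = summary.lower()
    hits_in = [t for t in include_terms if t.lower() in text]
    hits_out = [t for t in exclude_terms if t.lower() in text]
    if hits_in and not hits_out:
        decision = "include"
    elif hits_out and not hits_in:
        decision = "exclude"
    else:
        # Conflicting or missing signals are left to a human reviewer.
        decision = "manual review"
    return decision, hits_in, hits_out


# Hypothetical criteria, loosely based on the IT compliance example.
include_terms = ["compliance management", "information security"]
exclude_terms = ["tax compliance", "medication"]

first = screen_summary("A framework for IT compliance management in banks.",
                       include_terms, exclude_terms)
second = screen_summary("This paper studies tax compliance among small firms.",
                        include_terms, exclude_terms)
third = screen_summary("A survey of cloud adoption.", include_terms, exclude_terms)
print(first[0], second[0], third[0], sep=", ")
```

The suggested decisions are only a way to order the manual work; the final inclusion or exclusion remains the responsibility of the reviewers.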

Strategy 2: Initial bibliometric analysis

There are many different bibliometric approaches that can be useful to help get started. Keeping in mind that the primary audience of this article is in the Science, Technology, Engineering and Math (STEM) fields, the use of a statistical analysis tool called RStudio is proposed. Using text analysis tools can help identify the more significant references that emerge from the documents identified previously. An example, with sample code, is presented. The article does not go into the installation and configuration of RStudio, which can easily be performed using information found online.

Statistical article analysis

The first analysis presented in this article uses RStudio to investigate the most significant keywords found in the corpus of documents assembled through the process described earlier. From these, after generating the automated summaries desca few

The code used is:

# This R script is used to analyse large volumes of PDF files
# Created by Dr Marc-André Léger
# This version 28 June 2022

# This is the output Excel file name
excel_out <- "words_analysis_102.xlsx"

# load the required libraries
library("xlsx")
require(pdftools) # reads pdf documents
require(tm) # text mining analysis

# get all the files
files <- list.files("documents", pattern = "pdf$", full.names = TRUE, recursive = TRUE)
opinions <- lapply(files, pdf_text)
length(opinions) # make sure how many files are loaded
lapply(opinions, length) # and the length in pages of each PDF file

# create a PDF database for the wordcloud and the stemmed analysis
pdfdatabase <- Corpus(URISource(files), readerControl = list(reader = readPDF))
pdfdatabase <- tm_map(pdfdatabase, removePunctuation, ucp = TRUE)
opinions.tdm <- TermDocumentMatrix(pdfdatabase, control = list(removePunctuation = TRUE, stopwords = TRUE, tolower = TRUE, stemming = FALSE, removeNumbers = TRUE, bounds = list(global = c(3, Inf))))
inspect(opinions.tdm[10:20, ]) # examine 10 words at a time across documents
opinionstemmed.tdm <- TermDocumentMatrix(pdfdatabase, control = list(removePunctuation = TRUE, stopwords = TRUE, tolower = TRUE, stemming = TRUE, removeNumbers = TRUE, bounds = list(global = c(3, Inf))))
inspect(opinionstemmed.tdm[10:20, ]) # examine 10 words at a time across documents

# prepare the word matrix
ft <- findFreqTerms(opinions.tdm, lowfreq = 100, highfreq = Inf)
as.matrix(opinions.tdm[ft, ])
ft.tdm <- as.matrix(opinions.tdm[ft, ])
df <- sort(apply(ft.tdm, 1, sum), decreasing = TRUE)

# prepare the word matrix for the word analysis
ft2 <- findFreqTerms(opinionstemmed.tdm, lowfreq = 100, highfreq = Inf)
as.matrix(opinionstemmed.tdm[ft2, ])
ft2.tdm <- as.matrix(opinionstemmed.tdm[ft2, ])
df2 <- sort(apply(ft2.tdm, 1, sum), decreasing = TRUE)
# print(ft.tdm) # this might be used for debugging
# print(df) # this might be used for debugging

# save the results
output1 <- data.frame(df)
output2 <- data.frame(ft.tdm)
output3 <- data.frame(df2)
output4 <- data.frame(ft2.tdm)

# then export them to an Excel file
tmp1 <- write.xlsx(output1, excel_out, sheetName = "Articles", col.names = TRUE, row.names = TRUE, append = FALSE)
tmp2 <- write.xlsx(output2, excel_out, sheetName = "Words", col.names = TRUE, row.names = TRUE, append = TRUE)
tmp3 <- write.xlsx(output3, excel_out, sheetName = "Articles_Stemmed", col.names = TRUE, row.names = TRUE, append = TRUE)
tmp4 <- write.xlsx(output4, excel_out, sheetName = "Words_Stemmed", col.names = TRUE, row.names = TRUE, append = TRUE)

This example produces an Excel file with the results from the documents that have been identified. Table x presents the ten most frequent words from the documents.

Word	Occurrences
compliance	7770
information	3753
security	3475
management	3195
business	3185
process	3185
data	2343
can	2284
research	2218
model	2087

Table x: The ten most frequent words from the documents

From there, further analysis in Excel, selecting the most relevant words in stemmed format, makes it possible to identify the documents most likely to bring significant insights to the project. The results of this inquiry are presented in table x.

Reference	complianc	secur	risk	control	audit	govern	noncompli	cybersecur	Relevance
Hashmi2018d	682	29	43	83	50	22	25	0	934
Akhigbe2019	253	32	2	2	0	19	0	0	308
Ali2021	249	476	9	43	0	7	119	4	907
Rinderle.Ma2022	234	3	19	20	6	1	1	0	284
Castellanos2022	231	15	9	21	4	9	2	6	297
Hashmi2018c	227	4	3	9	11	2	11	0	267
Haelterman2020	222	9	82	55	9	11	1	0	389
Yazdanmehr2020	220	90	5	39	0	2	13	0	369
Cabanillas2020	200	3	2	61	5	5	1	0	277
Usman2020	198	7	3	0	0	3	1	0	212
Mustapha2018	193	2	19	32	2	2	0	0	250
Mustapha2020	190	7	1	26	1	3	1	0	229
Meissner2018	187	6	30	8	3	14	7	0	255
Kim2020	187	4	19	11	1	4	4	0	230
Konig2017	173	71	2	26	3	0	5	0	280
Mubarkoot2021	166	38	7	5	7	16	2	0	241
Gorgon2019	159	7	145	20	8	30	6	0	375
Donalds2020	149	295	13	7	0	3	1	77	545
Cheng2018	143	40	6	38	21	5	2	0	255
Chen2018	138	197	2	44	0	2	1	0	384
Lin2022	132	0	28	2	1	45	17	0	225
Huising2021	129	3	21	18	18	85	2	0	276
Alqahtani2021	129	197	6	2	0	14	4	10	362
Jin2021	125	4	72	4	2	0	1	0	208
Banuri2021	123	2	11	8	21	4	3	0	172
Alotaibi2019	119	199	3	6	0	3	44	0	374
Pathania2019	118	19	1	4	0	0	0	0	142
Asif2019	117	1	6	4	14	27	3	0	172
Hendra2021	112	5	8	3	1	3	3	0	135
Pand2020	112	4	10	0	5	3	72	0	206
Hashmi2018b	110	3	6	3	0	2	1	0	125
Arogundade2020	109	2	13	27	1	1	0	0	153
Niedzela2021	108	7	18	13	9	1	9	0	165
Petersson2021	93	158	9	4	0	4	31	0	299
Rahmouni2021	84	30	5	35	60	6	3	3	226
Nietsch2018	84	4	28	20	3	26	8	0	173
Wang2020	82	2	25	12	284	6	11	0	422
Hanrahan2021	78	51	93	9	0	20	1	0	252
Moody2018	77	245	9	79	0	1	9	0	420
DArcy2019	75	94	2	23	0	1	6	0	201
Corea2020	73	0	0	3	5	3	1	0	85
Cunha2021	69	8	3	6	1	0	1	1	89
Dai2021	68	0	1	38	4	7	5	0	123
Koohang2020	67	104	4	9	0	2	0	0	186
Bussmann2019	67	0	7	27	0	2	1	0	104
Asif2020	64	1	24	4	15	35	13	0	156
Koohang2020	61	112	4	9	0	0	0	1	187
Winter2020	60	3	0	1	0	0	1	0	65
Torre2019	57	9	2	24	1	0	1	0	94
Cammelli2022	50	5	10	5	16	22	19	0	127
Ragulina2019	48	1	1	3	1	10	0	0	64
Barlow2018	46	224	5	9	0	1	11	0	296
Scope2021	45	8	2	4	2	13	1	0	75
Salguero.Caparros2020	45	0	17	6	3	1	14	0	86
Gaur2019	44	2	36	10	2	12	3	0	109
Becker2019	43	2	27	3	2	2	6	0	85
Hashmi2018	40	128	8	16	4	6	0	0	202
Lembcke2019	38	74	2	9	0	0	1	0	124
Painter2019	38	0	23	9	13	17	0	0	100
Norimarna2021	34	0	70	6	2	44	0	1	157
Ophoff2021	34	106	2	0	2	4	56	15	219
Becker2020	30	0	28	9	0	3	0	2	72
Sackmann2018	23	1	0	0	0	0	0	0	24
Pudijanto2021	22	3	21	6	126	17	0	0	195
Culot2021	19	112	23	28	7	33	0	16	238
Pankowska2019	13	7	40	32	0	26	0	0	118
Johannsen2020	10	33	8	2	0	8	0	1	62
Mukhopadhyay2019	10	172	99	16	6	10	1	1	315
Widjaya2019	9	23	4	7	3	28	0	0	74
Hofman2018	6	0	1	1	0	5	3	0	16
Al.Anzi2014	6	62	7	14	2	5	0	0	96
Na2019	2	141	19	4	0	7	0	1	174
Jensen1976	1	17	28	29	2	6	0	0	83
Offerman2017	1	0	0	1	0	6	0	0	8
Costa2016	0	0	0	3	0	0	0	0	3
Alshehri2019	0	0	0	0	0	0	0	0	0
Total	7760	3723	1321	1179	769	747	569	139	16207
Document count	74	64	70	71	47	65	53	14	76

The data in table x reveals the significance of certain articles in relation to the research subject, as well as in relation to the different terms of interest for the project. The first data column contains the number of occurrences of the stem complianc in each article, which covers compliance and the other variations on that word stem. As this is the main topic of our inquiry, it is logical that this is the most frequent term. The document with the highest count for this stem, Hashmi2018d, also has a high frequency of other keywords that are closely related to our research subject. The last column, Relevance, is the sum of the eight stem counts for the article (934 for Hashmi2018d) and indicates the relative importance of the article for the research subject. A document that combines the highest count for the most important keyword with the highest relevance across all keywords has a high potential of being very relevant for our project and should be a high priority on the reading list.
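The Relevance column was produced in Excel, but the values in the table are consistent with a simple row sum of the eight stem counts. As a sketch of how the same column and ranking could be obtained directly in R, the example below uses the first three rows of the table; the object names counts, stems and ranked are chosen for the example only.

```r
# Stem counts for three articles, copied from the table above
counts <- data.frame(
  Reference  = c("Hashmi2018d", "Akhigbe2019", "Ali2021"),
  complianc  = c(682, 253, 249),
  secur      = c(29, 32, 476),
  risk       = c(43, 2, 9),
  control    = c(83, 2, 43),
  audit      = c(50, 0, 0),
  govern     = c(22, 19, 7),
  noncompli  = c(25, 0, 119),
  cybersecur = c(0, 0, 4)
)

# Relevance is the sum of all stem counts for a document
stems <- setdiff(names(counts), "Reference")
counts$Relevance <- rowSums(counts[, stems])

# Rank by the main stem first, then by overall relevance
ranked <- counts[order(-counts$complianc, -counts$Relevance), ]
print(ranked[, c("Reference", "complianc", "Relevance")])
```

Sorting on Relevance alone instead of complianc would place Ali2021 (907) ahead of Akhigbe2019 (308), which shows how the choice of sort key changes the reading order.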

Wordcloud

Wordclouds present a graphical representation of the most significant words that appear in the corpus of documents, with the relative size of each word representing its frequency across all of the documents. The code used is:

# This R script is used to create a wordcloud from PDF files
# Created by Dr Marc-André Léger
# This version 28 June 2022

# uncomment in this section if not already installed
# install.packages("wordcloud")
# install.packages("RColorBrewer")
# install.packages("wordcloud2")

# load the required libraries
library("wordcloud")
library("wordcloud2")
library("RColorBrewer")
require(pdftools) # reads pdf documents
require(tm) # text mining analysis

# get all the files
files <- list.files("documents", pattern = "pdf$", full.names = TRUE, recursive = TRUE)
opinions <- lapply(files, pdf_text)
length(opinions) # make sure how many files are loaded
lapply(opinions, length) # and the length in pages of each PDF file

# create a PDF database for the wordcloud and the stemmed analysis
pdfdatabase <- Corpus(URISource(files), readerControl = list(reader = readPDF))
pdfdatabase <- tm_map(pdfdatabase, removePunctuation, ucp = TRUE)
opinions.tdm <- TermDocumentMatrix(pdfdatabase, control = list(removePunctuation = TRUE, stopwords = TRUE, tolower = TRUE, stemming = FALSE, removeNumbers = TRUE, bounds = list(global = c(3, Inf))))

# prepare the word matrix for the wordcloud
ft <- findFreqTerms(opinions.tdm, lowfreq = 100, highfreq = Inf)
as.matrix(opinions.tdm[ft, ])
ft.tdm <- as.matrix(opinions.tdm[ft, ])
freq1 <- apply(ft.tdm, 1, sum)

# finally the wordcloud
set.seed(1234)
wordcloud(words = ft, freq = freq1, min.freq = 10, max.words = 200, random.order = FALSE, rot.per = 0.35, colors = brewer.pal(8, "Dark2"))

In this example, the un-stemmed version of the words is used to provide more readable results. This can be helpful in presenting the research or for communications on the research topic. It can also be used to confirm the choices made in identifying the keywords used for the literature review, or to help validate the corpus in relation to the research topic. The wordcloud should show that the most frequent words align with the research topic. The result can be seen in figure 4.

Figure 4: Wordcloud of the selected corpus

Strategy 3: Advanced bibliometric analysis

Further analysis of the corpus of documents can be performed to gather additional insights into the research subject. Bibliometric analysis makes it possible to better understand the links between the documents, the authors, and the research field. What is proposed is the use of a bibliometric analysis tool called Bibliometrix, available online at https://www.bibliometrix.org/home/. Other tools, such as Quanteda, can also be used for this purpose. For novice researchers, Bibliometrix has a graphical user interface, called BiblioShiny and built with Shiny, which is documented online. Some examples of the information that can be extracted with this tool are presented in this article. More information on how to get the full benefit of the tool is available online.

In the example below, the scopus.bib and wos.bib files exported in an earlier step are used. After starting RStudio, the following instructions are used to start BiblioShiny:

library(bibliometrix) # load bibliometrix package

biblioshiny() # start the graphical user interface

Figure 5 shows the graphical user interface with the Scopus data loaded from an earlier example on compliance. The data is loaded through the Data – Load Data menu. It can then be used to help validate the information identified. The Overview – Main Information menu provides a better overview of the data, as shown in figure 6.

Figure 5: Bibliometrix Shiny graphical interface

Figure 6 shows that there are 530 different sources, covering a timespan from 1973 to 2022. In the earlier analysis, data from 2018 to 2022 was used to focus on recent sources of scientific data on the research topic. This shows that Scopus contains articles on this topic dating back to 1973. Further investigation, using Google Scholar, shows the evolution of the domain.

Figure 6: overview of the data

A scan of Google Scholar, Scopus and Web of Science citations is presented in figure 7. It indicates a surge in publications around 1999-2000. This makes sense to those familiar with the domain, as anecdotal evidence suggests that interest in the topic of compliance has increased significantly since that period, when a few well-known financial scandals brought the topic to the forefront. Attention to Governance, Risk and Compliance issues has also increased significantly since then.

By exploiting the data and using the different reports, further insights can be gathered. We can identify the most cited authors, presented in table x. In using the source material, these authors should be included. Of course, the articles need to be reviewed and taken in context, but material from these authors should be prioritized.

Author	Citations
GOVERNATORI G	172
HINA S	162
DOMINIC DD	161
HASHMI M	146
SOMMESTAD T	103
HALLBERG J	92
KUMAR A	89
RINDERLE-MA S	86
BUKSA I	78
RUDZAJS P	78

Table x shows the most cited articles. Here as well, these articles show a high potential of being very important to this field of enquiry. This should be confirmed by reading the articles, but they need to be included in the next phases of the literature review.

Document	Year	Local Citations	Global Citations
BULGURCU B, 2010, MIS QUART MANAGE INF SYST	2010	161	1076
HERATH T, 2009, EUR J INF SYST	2009	120	788
HERATH T, 2009, DECIS SUPPORT SYST	2009	83	534
IFINEDO P, 2012, COMPUT SECUR	2012	78	435
VANCE A, 2012, INF MANAGE	2012	71	417
SIPONEN M, 2014, INF MANAGE	2014	61	290
SADIQ S, 2007, LECT NOTES COMPUT SCI	2007	60	306
PAHNILA S, 2007, PROC ANNU HAWAII INT CONF SYST SCI	2007	59	320
PUHAKAINEN P, 2010, MIS QUART MANAGE INF SYST	2010	57	405
IFINEDO P, 2014, INF MANAGE	2014	57	237
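The tables in this section were read from the BiblioShiny interface. For readers who prefer a script, a comparable analysis might be run with the bibliometrix functions directly, as in the sketch below. It assumes that a BibTeX export from Scopus named scopus.bib is in the working directory and that the bibliometrix package is installed; the variable names are illustrative, and whether the resulting figures match the tables above exactly is not verified here.

```r
# Path to the Scopus export; adjust to where the file was saved (assumption)
bib_file <- "scopus.bib"
analysis_ran <- FALSE

# Only run when the package and the file are both available
if (requireNamespace("bibliometrix", quietly = TRUE) && file.exists(bib_file)) {
  library(bibliometrix)

  # Convert the BibTeX export into a bibliographic data frame
  M <- convert2df(file = bib_file, dbsource = "scopus", format = "bibtex")

  # Descriptive analysis: sources, timespan, authors, most cited documents
  results <- biblioAnalysis(M, sep = ";")
  summary(results, k = 10, pause = FALSE)

  # Citations received from within the collection, by author and by paper
  cited <- localCitations(M, sep = ";")
  print(head(cited$Authors, 10))
  print(head(cited$Papers, 10))

  analysis_ran <- TRUE
} else {
  message("bibliometrix or ", bib_file, " not found; analysis skipped")
}
```

The same call with dbsource = "wos" would be the starting point for the wos.bib file; the export format chosen in Web of Science determines the value to pass as format.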

Bibliography

Aria M, Cuccurullo C (2017). “bibliometrix: An R-tool for comprehensive science mapping analysis.” Journal of Informetrics, 11(4), 959-975. https://doi.org/10.1016/j.joi.2017.08.007.

Benoit, Kenneth, Kohei Watanabe, Haiyan Wang, Paul Nulty, Adam Obeng, Stefan Müller, and Akitaka Matsuo. (2018) “quanteda: An R package for the quantitative analysis of textual data”. Journal of Open Source Software. 3(30), 774. https://doi.org/10.21105/joss.00774.

Appendix A: articles from the example

Abbasipour, M., Khendek, F., & Toeroe, M. (2018). Trigger correlation for dynamic system reconfiguration. Proceedings of the ACM Symposium on Applied Computing, 427‑430. https://doi.org/10.1145/3167132.3167383

Afrifah, W., Epiphaniou, G., Ersotelos, N., & Maple, C. (2022). Barriers and opportunities in cyber risk and compliance management for data-driven supply chains.

Akhigbe, O., Amyot, D., & Richards, G. (2019). A systematic literature mapping of goal and non-goal modelling methods for legal and regulatory compliance. Requirements Engineering, 24(4), 459‑481. https://doi.org/10.1007/s00766-018-0294-1

Ali, R. F., Dominic, P. D. D., Ali, S. E. A., Rehman, M., & Sohail, A. (2021). Information security behavior and information security policy compliance : A systematic literature review for identifying the transformation process from noncompliance to compliance. Applied Sciences, 11(8), 3383.

Alotaibi, M. J., Furnell, S., & Clarke, N. (2019). A framework for reporting and dealing with end-user security policy compliance. 27(1), 2‑25. https://doi.org/10.1108/ICS-12-2017-0097

Alqahtani, M., & Braun, R. (2021). Reviewing influence of UTAUT2 factors on cyber security compliance : A literature review. Journal of Information Assurance & Cyber security.

Alshammari, S. T., Alsubhi, K., Aljahdali, H. M. A., & Alghamdi, A. M. (2021). Trust Management Systems in Cloud Services Environment : Taxonomy of Reputation Attacks and Defense Mechanisms. IEEE Access, 9. https://doi.org/10.1109/ACCESS.2021.3132580

Alshehri, F., Kauser, S., & Fotaki, M. (2019). Muslims’ View of God as a Predictor of Ethical Behaviour in Organisations : Scale Development and Validation. Journal of Business Ethics, 158(4), 1009‑1027. https://doi.org/10.1007/s10551-017-3719-8

Antonucci, Y. L., Fortune, A., & Kirchmer, M. (2021). An examination of associations between business process management capabilities and the benefits of digitalization : All capabilities are not equal. Business Process Management Journal, 27(1), 124‑144. https://doi.org/10.1108/BPMJ-02-2020-0079

Arsenijević, O., Podbregar, I., Šprajc, P., Trivan, D., & Ziegler, Y. (2018). The Concept of Innovation of User Roles and Authorizations from View of Compliance Management. ORGANIZACIJA IN NEGOTOVOSTI V DIGITALNI DOBI ORGANIZATION AND UNCERTAINTY IN THE DIGITAL AGE, 747.

Asif, M. (2020). Supplier socioenvironmental compliance : A survey of the antecedents of standards decoupling. Journal of Cleaner Production, 246, 118956.

Asif, M., Jajja, M. S. S., & Searcy, C. (2019). Social compliance standards : Re-evaluating the buyer and supplier perspectives. Journal of Cleaner Production, 227, 457‑471.

Banuri, S. (2021). A Behavioural Economics Perspective on Compliance. Banuri, Sheheryar.

Barlow, J. B., Warkentin, M., Ormond, D., & Dennis, A. R. (2018). Don’t Even Think About It ! The Effects of Antineutralization, Informational, and Normative Communication on Information Security Compliance. Journal of the Association for Information Systems. https://doi.org/10.17705/1JAIS.00506

Becker, M., & Buchkremer, R. (2019). A practical process mining approach for compliance management. 27(4), 464‑478. https://doi.org/10.1108/JFRC-12-2018-0163

Becker, M., Merz, K., & Buchkremer, R. (2020). RegTech—the application of modern information technology in regulatory affairs : Areas of interest in research and practice. Intelligent Systems in Accounting, Finance and Management, 27(4), 161‑167. https://doi.org/10.1002/isaf.1479

Brandis, K., Dzombeta, S., Colomo-Palacios, R., & Stantchev, V. (2019). Governance, risk, and compliance in cloud scenarios. Applied Sciences (Switzerland), 9(2). https://doi.org/10.3390/app9020320

Bussmann, K. D., & Niemeczek, A. (2019). Compliance through company culture and values : An international study based on the example of corruption prevention. Journal of Business Ethics, 157(3), 797‑811.

Cabanillas, C., Resinas, M., & Ruiz-Cortes, A. (2020). A Mashup-based Framework for Business Process Compliance Checking. https://doi.org/10.1109/TSC.2020.3001292

Castellanos Ardila, J. P., Gallina, B., & Ul Muram, F. (2022). Compliance checking of software processes : A systematic literature review. Journal of Software: Evolution and Process, 34(5), e2440.

Chen, X., Chen, L., & Wu, D. (2018). Factors That Influence Employees’ Security Policy Compliance : An Awareness-Motivation-Capability Perspective. Journal of Computer Information Systems. https://doi.org/10.1080/08874417.2016.1258679

Cheng, D. C., Villamarin, J. B., Cu, G., & Lim-Cheng, N. R. (2018). Towards end-to-end continuous monitoring of compliance status across multiple requirements. 9(12), 456‑466. https://doi.org/10.14569/IJACSA.2018.091264

Cheng, D. C., Villamarin, J. B., Cu, G., & Lim-Cheng, N. R. (2019). Towards Compliance Management Automation thru Ontology mapping of Requirements to Activities and Controls. In S. Z. Abidin K.A.Z. Mohd M. (Ed.), Proceedings of the 2018 Cyber Resilience Conference, CRC 2018. Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/CR.2018.8626817

Coglianese, C., & Nash, J. (2020). Compliance Management Systems : Do they make a difference? Cambridge Handbook of Compliance (D. Daniel Sokol & Benjamin van Rooij eds., Cambridge University Press, Forthcoming), U of Penn, Inst for Law & Econ Research Paper, 20‑35.

Corea, C., & Delfmann, P. (2020). A Taxonomy of Business Rule Organizing Approaches in Regard to Business Process Compliance. Enterprise Modelling and Information Systems Architectures (EMISAJ). https://doi.org/10.18417/EMISA.15.4

Culot, G., Nassimbeni, G., Podrecca, M., & Sartor, M. (2021). The ISO/IEC 27001 information security management standard : Literature review and theory-based research agenda. The TQM Journal.

Cunha, V. H. C., Caiado, R. G. G., Corseuil, E. T., Neves, H. F., & Bacoccoli, L. (2021). Automated compliance checking in the context of Industry 4.0 : From a systematic review to an empirical fuzzy multi-criteria approach. Soft Computing, 25(8), 6055‑6074.

Dai, F. (2021). Labor control strategy in China : Compliance management practice in the socialist workplace. 21(3), 86‑101.

Daimi, K., & Peoples, C. (2021). Advances in cybersecurity management (Vol. 1‑1 online resource). Springer. https://search.ebscohost.com/login.aspx?direct=true&scope=site&db=nlebk&db=nlabk&AN=2951183

Danielis, P., Beckmann, M., & Skodzik, J. (2020). An ISO-Compliant Test Procedure for Technical Risk Analyses of IoT Systems Based on STRIDE. In A. S. I. Chan W.K. Claycomb B. ,. Takakura H. ,. Yang J. J. ,. Teranishi Y. ,. Towey D. ,. Segura S. ,. Shahriar H. ,. Reisman S. (Ed.), Proceedings – 2020 IEEE 44th Annual Computers, Software, and Applications Conference, COMPSAC 2020 (pp. 499‑504). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/COMPSAC48688.2020.0-203

D’Arcy, J., & Teh, P.-L. (2019). Predicting employee information security policy compliance on a daily basis : The interplay of security-related stress, emotions, and neutralization. Information & Management. https://doi.org/10.1016/J.IM.2019.02.006

Donalds, C. M., & Osei-Bryson, K.-M. (2020). Cybersecurity compliance behavior : Exploring the influences of individual decision style and other antecedents. International Journal of Information Management. https://doi.org/10.1016/J.IJINFOMGT.2019.102056

Ekanoye, F., & James, O. (2018). Global Market Access Regulations, Compliance Management in Developing Countries : A Brief Case Study of Three African Countries. 2018 IEEE Symposium on Product Compliance Engineering (SPCEB-Boston), 1‑6.

Fdhila, W., Rinderle-Ma, S., Knuplesch, D., & Reichert, M. (2020). Decomposition-based Verification of Global Compliance in Process Choreographies. Proceedings – 2020 IEEE 24th International Enterprise Distributed Object Computing Conference, EDOC 2020, 77‑86. https://doi.org/10.1109/EDOC49727.2020.00019

Gallina, B. (2020). A Barbell Strategy-oriented Regulatory Framework and Compliance Management. Communications in Computer and Information Science, 1251 CCIS, 696‑705. https://doi.org/10.1007/978-3-030-56441-4_52

Gaur, A., Ghosh, K., & Zheng, Q. (2019). Corporate social responsibility (CSR) in Asian firms : A strategic choice perspective of ethics and compliance management. 13(4), 633‑655. https://doi.org/10.1108/JABS-03-2019-0094

Ghiran, A.-M., Buchmann, R. A., & Osman, C.-C. (2018). Security requirements elicitation from engineering governance, risk management and compliance. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 10753 LNCS, 283‑289. https://doi.org/10.1007/978-3-319-77243-1_17

Gorgoń, M., Raczkowski, K., & Kraft, F. (2019). Compliance Risk Management in Polish and German Companies. Journal of Intercultural Management, 11(4), 115‑145.

Haelterman, H. (2020). Breaking Silos of Legal and Regulatory Risks to Outperform Traditional Compliance Approaches. European Journal on Criminal Policy and Research, 28(1), 19‑36. https://doi.org/10.1007/s10610-020-09468-x

Hanrahan, P., & Bednall, T. (2021). From Stepping-Stones to Throwing Stones : Officers’ Liability for Corporate Compliance Failures after Cassimatis. Federal Law Review, 49(3), 380‑409.

Hashmi, A., Ranjan, A., & Anand, A. (2018). Security and Compliance Management in Cloud Computing. International Journal of Advanced Studies in Computer Science and Engineering, 7(1), 47‑54.

Hashmi, M., Casanovas, P., & de Koker, L. (2018). Legal compliance through design : Preliminary results of a literature survey. TERECOM2018@ JURIX, Technologies for Regulatory Compliance http://ceur-ws.org, 2309, 06.

Hashmi, M., & Governatori, G. (2018). Norms modeling constructs of business process compliance management frameworks : A conceptual evaluation. Artificial Intelligence and Law, 26(3), 251‑305. https://doi.org/10.1007/s10506-017-9215-8

Hashmi, M., Governatori, G., Lam, H.-P., & Wynn, M. T. (2018). Are we done with business process compliance : State of the art and challenges ahead. Knowledge and Information Systems : An International Journal, 57(1), 79‑133. https://doi.org/10.1007/s10115-017-1142-1

Hendra, R. (2021). Comparative Review of the Latest Concept in Compliance Management & The Compliance Management Maturity Models. RSF Conference Series: Business, Management and Social Sciences, 1(5), 116‑124.

Hofmann, A. (2018). Is the Commission levelling the playing field? Rights enforcement in the European Union. Journal of European Integration, 40(6), 737‑751. https://doi.org/10.1080/07036337.2018.1501368

Huising, R., & Silbey, S. S. (2021). Accountability infrastructures : Pragmatic compliance inside organizations. Regulation & Governance, 15, S40‑S62.

Javed, M. A., Muram, F. U., & Kanwal, S. (2022). Ontology-Based Natural Language Processing for Process Compliance Management. Communications in Computer and Information Science, 1556 CCIS, 309‑327. https://doi.org/10.1007/978-3-030-96648-5_14

Jin, L., He, C., Wang, X., Wang, M., & Zhang, L. (2021). The effectiveness evaluation of system construction for compliance management in the electricity market. IOP Conference Series: Earth and Environmental Science. https://doi.org/10.1088/1755-1315/647/1/012024

Kavitha, D., & Ravikumar, S. (2021). Software Security Requirement Engineering for Risk and Compliance Management.

Kim, S. S. (2020). The "Relatedness" perspective in compliance management of multi-business firms. 30(2), 353‑373. https://doi.org/10.14329/apjis.2020.30.2.353

Koohang, A., Nord, J. H., Sandoval, Z. V., & Paliszkiewicz, J. (2020). Reliability, Validity, and Strength of a Unified Model for Information Security Policy Compliance. Journal of Computer Information Systems. https://doi.org/10.1080/08874417.2020.1779151

Koohang, A., Nowak, A., Paliszkiewicz, J., & Nord, J. H. (2020). Information Security Policy Compliance : Leadership, Trust, Role Values, and Awareness. Journal of Computer Information Systems. https://doi.org/10.1080/08874417.2019.1668738

Kruessmann, T. (2018). The compliance movement in Russia : What is driving it? Russian Law Journal, 6(2), 147‑163. https://doi.org/10.17589/2309-8678-2018-6-2-147-163

Labanca, D., Primerano, L., Markland-Montgomery, M., Polino, M., Carminati, M., & Zanero, S. (2022). Amaretto : An Active Learning Framework for Money Laundering Detection. IEEE Access, 10. https://doi.org/10.1109/ACCESS.2022.3167699

Lahann, J., Scheid, M., & Fettke, P. (2019). Utilizing machine learning techniques to reveal VAT compliance violations in accounting data. In N. D. Becker J. (Ed.), Proceedings – 21st IEEE Conference on Business Informatics, CBI 2019 (Vol. 1, pp. 1‑10). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/CBI.2019.00008

Lembcke, T.-B., Masuch, K., Trang, S., Hengstler, S., Plics, P., & Pamuk, M. (2019). Fostering Information Security Compliance : Comparing the Predictive Power of Social Learning Theory and Deterrence Theory. Americas Conference on Information Systems.

Liu, B. (2021). Construction of enterprise compliance management and supervision system based on ADR mechanism in Internet Environment. Proceedings – 2021 International Conference on Management Science and Software Engineering, ICMSSE 2021, 314‑317. https://doi.org/10.1109/ICMSSE53595.2021.00073

Luo, M., Wu, C., & Chen, Y. (2019). Construction of ping an airport’s total risk monitoring indicator system. ICTIS 2019 – 5th International Conference on Transportation Information and Safety, 829‑832. https://doi.org/10.1109/ICTIS.2019.8883586

Meissner, M. H. (2018). Accountability of senior compliance management for compliance failures in a credit institution. Journal of Financial Crime.

Mohamed, A. A., El-Bendary, N., & Abdo, A. (2021). An Essential Intelligent Framework for Regulatory Compliance Management in the Public Sector : The Case of Healthcare Insurance in Egypt. Proceedings of the Computational Methods in Systems and Software, 397‑409.

Moody, G. D., Siponen, M. T., & Pahnila, S. (2018). Toward a Unified Model of Information Security Policy Compliance. Management Information Systems Quarterly. https://doi.org/10.25300/MISQ/2018/13853

Mubarkoot, M., & Altmann, J. (2021a). Software Compliance in different Industries : A Systematic Literature Review.

Mubarkoot, M., & Altmann, J. (2021b). Towards Software Compliance Specification and Enforcement Using TOSCA. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 13072 LNCS, 168‑177. https://doi.org/10.1007/978-3-030-92916-9_14

Mukhopadhyay, A., Chatterjee, S., Bagchi, K. K., Kirs, P. J., & Shukla, G. K. (2019). Cyber Risk Assessment and Mitigation (CRAM) Framework Using Logit and Probit Models for Cyber Insurance. Information Systems Frontiers : A Journal of Research and Innovation, 21(5), 997‑1018. https://doi.org/10.1007/s10796-017-9808-5

Mustapha, A. M., Arogundade, O. T., Misra, S., Damasevicius, R., & Maskeliunas, R. (2020). A systematic literature review on compliance requirements management of business processes. International Journal of System Assurance Engineering and Management, 11(3), 561‑576.

Mustapha, A. M., Arogundade, O. T., Vincent, O. R., & Adeniran, O. J. (2018). Towards a compliance requirement management for SMSEs : A model and architecture. 16(1), 155‑185. https://doi.org/10.1007/s10257-017-0354-y

Na, O., Park, L. W., Yu, H., Kim, Y., & Chang, H. (2019). The rating model of corporate information for economic security activities. Security Journal, 32(4), 435‑456. https://doi.org/10.1057/s41284-019-00171-z

Niedzela, L., Kuehnel, S., & Seyffarth, T. (2021). Economic Assessment and Analysis of Compliance in Business Processes : A Systematic Literature Review and Research Agenda.

Nietsch, M. (2018). Corporate illegal conduct and directors’ liability : An approach to personal accountability for violations of corporate legal compliance. Journal of Corporate Law Studies, 18(1), 151‑184. https://doi.org/10.1080/14735970.2017.1365460

Nizan Geslevich Packin. (2018). Regtech, Compliance and Technology Judgement Rule. Chicago-Kent Law Review, 93(1).

Norimarna, S. (2021). Conceptual Review : Compatibility of regulatory requirements of FSA to Insurance industry in Indonesia for Integrated GRC. RSF Conference Series: Business, Management and Social Sciences, 1(5), 105‑115.

Oosthuizen, A., van Vuuren, J., & Botha, M. (2020). Compliance or management : The benefits that small business owners gain from frequently sourcing accounting services. The Southern African Journal of Entrepreneurship and Small Business Management, 12(1). https://doi.org/10.4102/sajesbm.v12i1.330

Ophoff, J., & Renaud, K. (2021). Revealing the Cyber Security Non-Compliance "Attribution Gulf". Hawaii International Conference on System Sciences. https://doi.org/10.24251/HICSS.2021.552

Ozeer, U. (2021). ϕ comp : An Architecture for Monitoring and Enforcing Security Compliance in Sensitive Health Data Environment. Proceedings – 2021 IEEE 18th International Conference on Software Architecture Companion, ICSA-C 2021, 70‑77. https://doi.org/10.1109/ICSA-C52384.2021.00017

Painter, M., Pouryousefi, S., Hibbert, S., & Russon, J.-A. (2019). Sharing Vocabularies : Towards Horizontal Alignment of Values-Driven Business Functions. Journal of Business Ethics, 155(4), 965‑979. https://doi.org/10.1007/s10551-018-3901-7

Pang, S., Yang, J., Li, R., & Cao, J. (2020). Static Game Models and Applications Based on Market Supervision and Compliance Management of P2P Platform. 2020. https://doi.org/10.1155/2020/8869132

Pankowska, M. (2019). Information technology outsourcing chain : Literature review and implications for development of distributed coordination. Sustainability, 11(5), 1460.

Pathania, A., & Rasool, G. (2019). Investigating power styles and behavioural compliance for effective hospital administration : An application of AHP. International Journal of Health Care Quality Assurance.

Petersson, J., Karlsson, F., & Kolkowska, E. (2021). Information Security Policy Compliance—Eliciting Requirements for a Computerized Software to support Value-Based Compliance Analysis. Computers & Security. https://doi.org/10.1016/J.COSE.2021.102578

Petkevičienė, M. (2021). Compliance management development for C-level management in Lithuanian companies [Master’s Thesis].

Prakash, A. M., He, Q., & Zhong, X. (2019). Incentive-driven post-discharge compliance management for chronic disease patients in healthcare service operations. IISE Transactions on Healthcare Systems Engineering, 9(1), 71‑82. https://doi.org/10.1080/24725579.2019.1567630

Pudjianto, W. (2021). Process mining in governance, risk management, compliance (GRC) and auditing: A systematic literature review. Journal of Theoretical and Applied Information Technology, 99(18).

Ragulina, J. V. (2019). Compliance Approaches and Practices for Increasing Competitiveness of Industrial Enterprises: Current Research and Future Agenda. The International Scientific and Practical Forum “Industry. Science. Competence. Integration”, 903‑909.

Rahmouni, H., Munir, K., Essefi, I., Mont, M., & Solomonides, T. (2021). An Ontology-based Compliance Audit Framework for Medical Data Sharing across Europe. International Arab Journal of Information Technology, 18(2), 158‑169. https://doi.org/10.34028/iajit/18/2/4

Ramachandran, G. S., Deane, F., Malik, S., Dorri, A., & Jurdak, R. (2021). Towards Assisted Autonomy for Supply Chain Compliance Management. Proceedings – 2021 3rd IEEE International Conference on Trust, Privacy and Security in Intelligent Systems and Applications, TPS-ISA 2021, 321‑330. https://doi.org/10.1109/TPSISA52974.2021.00035

Riehle, D. M. (2019). Checking Business Process Models for Compliance – Comparing Graph Matching and Temporal Logic. Lecture Notes in Business Information Processing, 342, 403‑415. https://doi.org/10.1007/978-3-030-11641-5_32

Rinderle-Ma, S., & Winter, K. (2022). Predictive Compliance Monitoring in Process-Aware Information Systems: State of the Art, Functionalities, Research Directions. arXiv preprint arXiv:2205.05446. https://doi.org/10.48550/arXiv.2205.05446

Sackmann, S., Kuehnel, S., & Seyffarth, T. (2018). Using business process compliance approaches for compliance management with regard to digitization: Evidence from a systematic literature review. International Conference on Business Process Management, 409‑425.

Salguero-Caparrós, F., Pardo-Ferreira, M. del C., Martínez-Rojas, M., & Rubio-Romero, J. C. (2020). Management of legal compliance in occupational health and safety. A literature review. Safety Science, 121, 111‑118.

Schneider, A., & Mauve, M. (2018). Compliance management for P2P systems. 2017 23rd Asia-Pacific Conference on Communications: Bridging the Metropolitan and the Remote, APCC 2017, 2018-January, 1‑6. https://doi.org/10.23919/APCC.2017.8303961

Scope, N., Rasin, A., Heart, K., Lenard, B., & Wagner, J. (2021). The Life of Data in Compliance Management.

Sothilingam, R., Pant, V., Shahrin, N., & Yu, E. (2021). Towards a Goal-Oriented Modeling Approach for Data Governance. CEUR Workshop Proceedings, 3045, 69‑77.

Sumaryadi, S., & Kusnadi, K. (2021). The influence of strategic planning and personnel competence on organizational performance of the TNI material feasibility service mediated by compliance management. Journal of Economics, Management, Entrepreneurship, and Business (JEMEB), 1(2), 128‑145.

Surridge, M., Meacham, K., Papay, J., Phillips, S. C., Pickering, J. B., Shafiee, A., & Wilkinson, T. (2019). Modelling compliance threats and security analysis of cross border health data exchange. Communications in Computer and Information Science, 1085, 180‑189. https://doi.org/10.1007/978-3-030-32213-7_14

Tanaka, Y., Kodate, A., & Bolt, T. (2018). Data sharing system based on legal risk assessment. ACM International Conference Proceeding Series. https://doi.org/10.1145/3227696.3227715

Timm, F. (2018). An application design for reference enterprise architecture models. Lecture Notes in Business Information Processing, 316, 209‑221. https://doi.org/10.1007/978-3-319-92898-2_18

Timm, F., & Sandkuhl, K. (2018a). A reference enterprise architecture for holistic compliance management in the financial sector.

Timm, F., & Sandkuhl, K. (2018b). Towards a reference compliance organization in the financial sector. Banking and information technology/Deutsche Ausgabe, 19(2), 38‑48.

Torre, D., Soltana, G., Sabetzadeh, M., Briand, L. C., Auffinger, Y., & Goes, P. (2019). Using Models to Enable Compliance Checking Against the GDPR: An Experience Report. Model Driven Engineering Languages and Systems. https://doi.org/10.1109/MODELS.2019.00-20

Usman, M., Felderer, M., Unterkalmsteiner, M., Klotins, E., Méndez, D., & Alégroth, E. (2020). Compliance Requirements in Large-Scale Software Development: An Industrial Case Study. Product Focused Software Process Improvement.

Van Rooij, B., & Fine, A. D. (2019). Preventing corporate crime from within: Compliance management, whistleblowing, and internal monitoring. The Handbook of White-Collar Crime, 229‑245.

Wang, D., Yang, R., & Gao, X. (2021). Data security compliance management and control technology based on scene orchestration. Proceedings – 2021 13th International Conference on Measuring Technology and Mechatronics Automation, ICMTMA 2021, 401‑408. https://doi.org/10.1109/ICMTMA52658.2021.00093

Widjaya, W., Sutedja, I., & Hartono, A. W. (2019). Key aspects of data management framework for early adopter: A systematic literature review.

Winter, K., Aa, H. van der, Rinderle-Ma, S., & Weidlich, M. (2020). Assessing the compliance of business process models with regulatory documents. International Conference on Conceptual Modeling. https://doi.org/10.1007/978-3-030-62522-1_14

Wu, X., & Liang, H. (2020). Exploration Research on the Model of Government Regulation Based on Compliance Management System. 2020 6th International Conference on Information Management (ICIM), 117‑121.

Yazdanmehr, A., Wang, J., & Yang, Z. (2020). Peers matter: The moderating role of social influence on information security policy compliance. Information Systems Journal. https://doi.org/10.1111/ISJ.12271