2019-12-29 On submitting a benchmark dataset paper:
(ICST 2019: BugsJS: a Benchmark of JavaScript Bugs)
JavaScript is a popular programming language that is also error-prone due to its asynchronous, dynamic, and loosely-typed nature. In recent years, numerous techniques have been proposed for analyzing and testing JavaScript applications. However, our survey of the literature in this area revealed that the proposed techniques are often evaluated on different datasets of programs and bugs [this claim feels shaky]. The lack of a commonly used benchmark limits the ability to perform fair and unbiased comparisons for assessing the efficacy of new techniques [this also seems shaky: everyone could simply compare on the same dataset; at most one can say there is no strong benchmark]. To fill this gap, we propose BugsJS, a benchmark of 453 [not that large either] real, manually validated JavaScript bugs from 10 popular JavaScript server-side programs, comprising 444k LOC in total. Each bug is accompanied by its bug report, the test cases that detect it, as well as the patch that fixes it. BugsJS features a rich interface for accessing the faulty and fixed versions of the programs and executing the corresponding test cases, which facilitates conducting highly-reproducible empirical studies and comparisons of JavaScript analysis and testing tools.
2019-12-13 National Memorial Day:
In essence, our study empirically confirms and complements previous research findings (and common sense): Developers (and users) prefer documentation that is correct, complete, up to date, usable, maintainable, readable and useful.
Findings: Table VII shows the impact of using each of the 17 languages on the number of bug fixing commits in a single-language (denoted as ⟨language⟩S) and multi-language (denoted as ⟨language⟩M) setting. From the table, we can note that the coefficients of the languages are not always statistically significant. The statistically significant ones are marked with one or multiple asterisks. There are 20 of them. For those that are not statistically significant (i.e., 14 of them), unfortunately not much conclusion can be drawn.
For some languages, the coefficient for the single-language setting is significant, while the one for the multi-language setting is not (four languages: CoffeeScript, Ruby, Erlang, Haskell). For some other languages, it is the other way around: the coefficient for the multi-language setting is significant, while the one for the single-language setting is not (four languages: C, Go, PHP, Python). For yet other languages, their coefficients for both settings are not significant (three languages: C#, JavaScript, Perl). Unfortunately, for such languages (11 languages), we cannot compare the two settings (i.e., single-language and multi-language), because the coefficient of at least one of the settings is inconclusive.
Thus, we focus on languages with statistically significant coefficients for both single and multi-language settings. We find six languages with statistically significant coefficients: C++, Objective-C, Java, TypeScript, Clojure, and Scala. For all of them, we consistently find that their coefficients are larger when they are used in a multi-language setting. This means that there is a statistically significant support that using these languages in a multi-language setting (rather than a single-language setting) increases bug proneness. The findings for the other eleven languages do not refute the six languages, because we cannot conclude when coefficients are not statistically significant.
Six languages including C++, Objective-C, Java, TypeScript, Clojure, and Scala are more defect prone when they are used with other languages. The results are inconclusive for the other eleven languages.
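A quick sanity check on the counts in the quoted passage, as a minimal Python sketch. The language groups are copied from the excerpt; the dictionary layout and the `tally` helper are my own illustration, not anything from the paper.

```python
# Significance pattern per language group, as reported in the quoted passage.
# Each entry: (languages, (significant in single-language setting,
#                          significant in multi-language setting)).
# The structure and names here are illustrative assumptions.
GROUPS = {
    "single_only": (["CoffeeScript", "Ruby", "Erlang", "Haskell"], (True, False)),
    "multi_only": (["C", "Go", "PHP", "Python"], (False, True)),
    "neither": (["C#", "JavaScript", "Perl"], (False, False)),
    "both": (["C++", "Objective-C", "Java", "TypeScript", "Clojure", "Scala"], (True, True)),
}


def tally(groups):
    """Count languages and significant / non-significant coefficients."""
    languages = 0
    significant = 0
    for names, (single, multi) in groups.values():
        languages += len(names)
        # Each language contributes one coefficient per setting.
        significant += len(names) * (int(single) + int(multi))
    total_coefficients = 2 * languages
    return {
        "languages": languages,
        "significant": significant,
        "not_significant": total_coefficients - significant,
    }


summary = tally(GROUPS)
comparable = GROUPS["both"][0]  # only these can be compared across settings
print(summary, len(comparable))
```

The tallies agree with the excerpt: 17 languages, 20 significant and 14 non-significant coefficients, and only 6 languages where the two settings can be compared.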
1. in objective terms
2. [Candidate for a close reading later] Patterns of Knowledge in API Reference Documentation. TSE'13 by Martin P. Robillard. This paper mainly analyzes the content of API reference documentation (e.g., the JDK and .NET documentation, where each web page is indexed by API name and describes how to use that API).
The content analysis mainly aims to find out what typical API documentation contains and how it is organized. Specifically, the authors put a lot of effort into first defining 12 knowledge types (e.g., what the API does, how it should be used), and then analyzed how these knowledge types are distributed in the documentation, with statistics broken down by type vs. method, classes vs. interface, and member vs. variable. This is supplemented by frequent itemset mining (computed with arules in R).
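On the writing and experiments: the whole process of the most important first step, defining the knowledge types, is well worth studying closely. The statistical analysis for the later, more intuitive RQs uses fairly conventional methods. The discussion of the significance of their own work is reasonable and convincing. Worth learning from!
If I do similar work, the methods in this paper are worth borrowing!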
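To remind myself what frequent itemset mining does with knowledge-type annotations, here is a minimal brute-force sketch in Python. The paper used arules in R; this is not that implementation. The toy documentation units, the knowledge-type labels, and the `frequent_itemsets` helper are all made up for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return every itemset whose support (fraction of transactions
    containing it) is at least min_support. Brute force over all subsets,
    so only suitable for small item sets."""
    n = len(transactions)
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for size in range(1, len(items) + 1):
            for combo in combinations(items, size):
                counts[combo] += 1
    return {combo: c / n for combo, c in counts.items() if c / n >= min_support}


# Hypothetical toy data: each set lists the knowledge types annotated on
# one documentation unit. Labels are invented, not the paper's taxonomy.
units = [
    {"functionality", "directive"},
    {"functionality", "code_example"},
    {"functionality", "directive", "reference"},
    {"functionality"},
]

result = frequent_itemsets(units, min_support=0.5)
for itemset, support in sorted(result.items()):
    print(itemset, support)
```

With a minimum support of 0.5, only "functionality", "directive", and the pair of the two survive, which is the kind of co-occurrence pattern the paper reports for its knowledge types.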
------en...support evidence:
All newly developed applications have bugs; some of them are quite difficult to locate because they exist within the coding logic, some are simply a matter of not