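What is fake data?
Fake data, as the name suggests, is made-up data: when real production data cannot be used, it is generated to resemble the data found in production. It is mostly used for development and testing.
Use cases for fake data
Which development or testing scenarios call for fake data?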
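- You need to build a UI prototype, but the API is not finished, so there is no data to display in the front end. In that case you can use mock data to simulate the API. UI work is no longer blocked, UI and API development can proceed in parallel, and some problems may surface earlier.
- A database needs to be populated with a large volume of data. Fake data that resembles production data can be generated and loaded automatically to meet development and testing needs.
- Stress testing requires a large volume of production-like data.
- Unit tests need dummy data.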
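The database-filling scenario above can be sketched with nothing but the standard library. This is only a minimal illustration, not code from the project described in this article: the `users` table, the name pools and the `make_user` and `fill_database` helpers are all hypothetical, and a dedicated fake-data library would supply far richer values.

```python
import random
import sqlite3

# Hypothetical sample values; a fake-data library would offer many more.
FIRST_NAMES = ["Alice", "Bob", "Carol", "David", "Eve"]
LAST_NAMES = ["Smith", "Jones", "Brown", "Taylor", "Wilson"]


def make_user(rng):
    """Build one fake (name, age, email) row from the given random generator."""
    first = rng.choice(FIRST_NAMES)
    last = rng.choice(LAST_NAMES)
    age = rng.randint(18, 80)
    email = f"{first}.{last}{rng.randint(1, 999)}@example.com".lower()
    return (f"{first} {last}", age, email)


def fill_database(conn, count, seed=None):
    """Create a hypothetical `users` table and insert `count` fake rows."""
    rng = random.Random(seed)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users (name TEXT, age INTEGER, email TEXT)"
    )
    conn.executemany(
        "INSERT INTO users VALUES (?, ?, ?)",
        (make_user(rng) for _ in range(count)),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
fill_database(conn, 1000, seed=42)
row_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print(row_count)
```

An in-memory SQLite database keeps the sketch self-contained; pointing `sqlite3.connect` at a file path would persist the rows instead.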
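Principles of fake data
Apart from deliberately destructive test data, the test data we need should resemble the production environment and real life rather than being a fixed set of values. Fake data that is close to production data does a better job of exposing potential problems in production, and it makes the product look like it has real practical value.
Ways to generate fake data
The project I currently work on involves filling in all sorts of forms that collect different user data. If I entered different data on every test run, wouldn't that be closer to how the product is really used, and might it even uncover some hidden problems? Production data is off limits for security and privacy reasons, so how can large amounts of fake data be generated? There are two main options:
- Online services, such as mock, fakename, randomuser and randomapi
- Libraries for various programming languages
This article introduces four Python modules for generating fake data: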
- lipsum - is a simple Lorem Ipsum generator library which can be used in your Python applications
- radar - Random date generation
- mimesis - is a fast and easy to use library for Python programming language, which helps generate mock data for a variety of purposes in a variety of languages
- Faker - is a Python package that generates fake data for you
Installing Python 3
Before we start, let's upgrade Python first. As the official site puts it:
Python 2.x is legacy, Python 3.x is the present and future of the language
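Moreover, many popular libraries such as numpy will stop maintaining Python 2 and move their development and maintenance to Python 3. So what reason is left to stick with Python 2?
To see how long Python 2 has until retirement, see here.
My Python development environment is still the Python 2.7.10 that ships with macOS, so I need to install Python 3 through Homebrew. For detailed tutorials, see here and here.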
┌─[diyu@CNdiyu] - [~] - [Wed Jan 10, 16:14]
└─[$] <> python3 --version
Python 3.6.4
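Done!
Lorem Ipsum: random placeholder text
lipsum is a generator of random sentences and passages of text. The text it generates is lorem ipsum text that reads like real Latin.
The code is very simple: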
import lipsum
print("generate 10 words")
print(lipsum.generate_words(10))
print("*" * 50)
print("generate 3 sentences")
for x in lipsum.generate_sentences(3).split('.'):
    print(x.strip())
print("*" * 50)
print("generate 3 paras")
for x in lipsum.generate_paragraphs(3).split('\n'):
    print(x)
Output:
generate 10 words
Quae cum dixissem, magis ut illum provocarem quam ut ips!
**************************************************
generate 3 sentences
Hunc vos beatum; ratio quidemvestra sic cogit
At ego quem huic anteponam non audeo dicere;dicet pro me ipsa virtus necdubitabit isti vestro beato M
Regulumanteponere, quem quidem, cum sua voluntate, nulla vi coactuspraeter fidem, quamdederat hosti, ex patria Karthaginemrevertisset, tum ipsum, cum vigiliis et fame cruciaretur, clamatvirtus beatioremfuisse quam potantem in rosa Thorium
**************************************************
generate 3 paras
Atque haec quidem de rerum nominibus. de ipsis rebus autem saepenumero, Brute, vereor ne reprehendar, cum haec ad te
scribam, qui cum in philosophia, tum in optimo genere philosophiae tantum processeris. quod si facerem quasi te
erudiens, iure reprehenderer. sed ab eo plurimum absum neque, ut ea cognoscas, quae tibi notissima sunt, ad te mitto,
sed quia facillime in nomine tuo adquiesco, et quia te habeo aequissimum eorum studiorum, quae mihi communia tecum sunt,
existimatorem et iudicem. attendes igitur, ut soles, diligenter eamque controversiam diiudicabis, quae mihi fuit cum
avunculo tuo, divino ac singulari viro. nam in Tusculano cum essem vellemque e bibliotheca pueri Luculli quibusdam
libris uti, veni in eius villam, ut eos ipse, ut solebam, depromerem. quo cum venissem, M. Catonem, quem ibi esse
nescieram, vidi in bibliotheca sedentem multis circumfusum Stoicorum libris. erat enim, ut scis, in eo aviditas legendi,
nec satiari poterat, quippe qui ne reprehensionem quidem vulgi inanem reformidans in ipsa curia soleret legere saepe,
dum senatus cogeretur, nihil operae rei publicae detrahens. quo magis tum in summo otio maximaque copia quasi helluari
libris, si hoc verbo in tam clara re utendum est, videbatur. quod cum accidisset ut alter alterum necopinato
videremus, surrexit statim. deinde prima illa, quae in congressu solemus: Quid tu, inquit, huc? a villa enim, credo, et:
Si ibi te esse scissem, ad te ipse venissem.
Heri, inquam, ludis commissis ex urbe profectus veni ad vesperum. causa autem fuit huc veniendi ut quosdam hinc libros
promerem. et quidem, Cato, hanc totam copiam iam Lucullo nostro notam esse oportebit; nam his libris eum malo quam
reliquo ornatu villae delectari. est enim mihi magnae curae - quamquam hoc quidem proprium tuum munus est - ut ita
erudiatur, ut et patri et Caepioni nostro et tibi tam propinquo respondeat. laboro autem non sine causa; nam et avi eius
memoria moveor - nec enim ignoras, quanti fecerim Caepionem, qui, ut opinio mea fert, in principibus iam esset, si
viveret - et Lucullus mihi versatur ante oculos, vir cum virtutibus omnibus excellens, tum mecum et amicitia et omni
voluntate sententiaque coniunctus.
Praeclare, inquit, facis, cum et eorum memoriam tenes, quorum uterque tibi testamento liberos suos commendavit, et
puerum diligis. quod autem meum munus dicis non equidem recuso, sed te adiungo socium. addo etiam illud, multa iam mihi
dare signa puerum et pudoris et ingenii, sed aetatem vides.
Websites such as loripsum.net also offer online generation.
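To make the idea concrete, here is a rough standard-library sketch of how a placeholder-text generator can work. It is not how lipsum is implemented: the `WORDS` pool and the `fake_sentence` and `fake_paragraph` helpers are hypothetical, and real generators draw on much larger Latin source texts.

```python
import random

# A tiny hypothetical word pool; real generators use far larger texts.
WORDS = ["lorem", "ipsum", "dolor", "sit", "amet", "consectetur",
         "adipiscing", "elit", "sed", "do", "eiusmod", "tempor"]


def fake_sentence(rng, min_words=4, max_words=10):
    """Join a random number of random words into one capitalised sentence."""
    count = rng.randint(min_words, max_words)
    words = [rng.choice(WORDS) for _ in range(count)]
    return " ".join(words).capitalize() + "."


def fake_paragraph(rng, sentences=3):
    """Join several fake sentences into a single paragraph."""
    return " ".join(fake_sentence(rng) for _ in range(sentences))


rng = random.Random(2018)
paragraph = fake_paragraph(rng)
print(paragraph)
```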
radar: random date generation
radar is very convenient for generating dates and times.
The code is also very simple:
import radar
import datetime
# random date
print(radar.random_date())
# random date and time
print(radar.random_datetime())
# random time
print(radar.random_time())
# random date within a given range
print(radar.random_date(
    start=datetime.datetime(year=1985, month=1, day=1),
    stop=datetime.datetime(year=1989, month=12, day=30)))
# random date and time within a given range
print(radar.random_datetime(
    start=datetime.datetime(year=1985, month=1, day=1),
    stop=datetime.datetime(year=1989, month=12, day=30)))
# random time within a given range
print(radar.random_time(
    start="2018-01-10T09:00:10",
    stop="2018-01-10T18:00:00"))
# By default radar parses dates with the python-dateutil library, which is heavy;
# the lightweight radar.utils.parse can be used instead (5 times faster)
print(radar.random_datetime(
    start="2018-01-10T09:00:10",
    stop="2018-01-10T18:00:00",
    parse=radar.utils.parse))
# radar.utils.parse usage
start = radar.utils.parse('2018-01-01')
stop = radar.utils.parse('2018-01-05')
print(radar.random_datetime(start=start, stop=stop))
Output:
2011-07-01
1997-10-25 16:59:15
04:45:21
1985-01-21 18:16:57
1988-06-27 02:49:24
12:49:16
2018-01-10 16:26:11
2018-01-02
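For readers curious about what picking a random date within a range involves, the following standard-library sketch shows one way to do it. It is not radar's implementation; the `random_datetime_between` helper is a hypothetical name, and second-level precision is an assumption made for simplicity.

```python
import datetime
import random


def random_datetime_between(start, stop, rng=random):
    """Return a random datetime with start <= result <= stop (second precision)."""
    if stop < start:
        raise ValueError("stop must not be earlier than start")
    # Pick a random offset in whole seconds and add it to the start.
    span_seconds = int((stop - start).total_seconds())
    offset = rng.randint(0, span_seconds)
    return start + datetime.timedelta(seconds=offset)


start = datetime.datetime(1985, 1, 1)
stop = datetime.datetime(1989, 12, 30)
picked = random_datetime_between(start, stop)
print(picked)
```

Passing a seeded `random.Random` instance as `rng` makes the picked value repeatable across runs.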
Mimesis: generating mock data
mimesis provides all kinds of data covering more than a dozen real-world scenarios, for example dummy data about transport (truck model, car etc.), personal data (name, surname, age, email etc.) and payment data (credit_card, credit_card_network etc.).
Using Mimesis
First decide on the locale. Mimesis supports as many as 33 languages; the example below generates data for the 'en' and 'zh' locales.
from mimesis import Personal
person_en = Personal('en')
print(person_en.full_name())
print(person_en.age())
print(person_en.favorite_movie())
print("*" * 20)
person_zh = Personal('zh')
print(person_zh.full_name())
print(person_zh.age())
print(person_zh.favorite_movie())
Output:
Karoline Schneider
33
21
********************
香茗 米
22
星際迷航3:超越星辰
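The rest of this article mainly uses English data.
As mentioned earlier, Mimesis offers a number of data providers that generate different categories of data. Below are some commonly used providers and their basic usage. For more providers, see the official website.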
from mimesis import Personal, Address, Business, Payment, Text, Food
from mimesis.enums import Gender
person = Personal('en')
# a gender can be passed to full_name()
print(person.full_name(Gender.MALE))
print(person.level_of_english())
print(person.nationality())
print(person.work_experience())
print(person.political_views())
print(person.worldview())
# custom username pattern
# the pattern can be one of ('U-d', 'U.d', 'UU-d', 'UU.d', 'UU_d', 'U_d', 'Ud', 'l-d', 'l.d', 'l_d', 'ld', 'default')
templates = ['l-d', 'U-d']
for item in templates:
    print(person.username(template=item))
address1 = Address('en')
print(address1.coordinates())
print(address1.city())
business1 = Business('en')
print(business1.company())
print(business1.company_type())
payment1 = Payment('en')
print(payment1.paypal())
print(payment1.credit_card_expiration_date())
# mimesis can also generate text
text1 = Text('en')
print(text1.alphabet())
print(text1.answer())
print(text1.quote())
print(text1.title())
print(text1.word())
print(text1.words())
print(text1.sentence())
food1 = Food('en')
print(food1.drink())
print(food1.fruit())
print(food1.spices())
Generic(*args, **kwargs) provides a unified interface: every provider can be reached through it.
from mimesis import Generic
g = Generic('en')
print(g.food.fruit())
print(g.address.postal_code())
If you want to use your own data and customize things, you can define your own class and expose the data through its attributes and methods.
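from mimesis import Generic
from mimesis.providers.base import BaseProvider  # base class for custom providers
g1 = Generic('en')
class oneProvider(BaseProvider):
    name = "dante"
    class Meta:
        # the provider is exposed on Generic under this name
        name = "oneprovider"
    def get_age(self):
        return "31"
g1.add_provider(oneProvider)
print(g1.oneprovider.get_age())
print(g1.oneprovider.name)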
If you don't want to generate values one at a time but would rather produce many records from a schema, use the Field and Schema objects.
from mimesis.enums import Gender
from mimesis.schema import Field, Schema
field = Field('en')
body = (
    lambda: {
        # the first argument to field() is the name of a provider method
        "name": field('full_name', gender=Gender.FEMALE),
        "age": field('age'),
        "email": field('email'),
        "occupation": field('occupation')
    }
)
schema = Schema(schema=body)
print(schema.create(iterations=1))
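The schema idea itself does not depend on any particular library. As a rough standard-library sketch (not Mimesis code): a schema is a callable that describes one record, and a helper calls it repeatedly. The `record` and `create` functions and the value pools below are hypothetical stand-ins for what Field and Schema provide.

```python
import random

rng = random.Random()

# Hypothetical value pools standing in for a real provider's data.
NAMES = ["Ada Lovelace", "Grace Hopper", "Alan Turing"]
OCCUPATIONS = ["Engineer", "Designer", "Tester"]


def record():
    """Describe one record; called once for every generated row."""
    name = rng.choice(NAMES)
    return {
        "name": name,
        "age": rng.randint(16, 66),
        "email": name.lower().replace(" ", ".") + "@example.com",
        "occupation": rng.choice(OCCUPATIONS),
    }


def create(schema, iterations=1):
    """Call the schema callable `iterations` times and collect the results."""
    return [schema() for _ in range(iterations)]


rows = create(record, iterations=5)
for row in rows:
    print(row)
```

Because the schema is evaluated afresh for every row, each record gets its own random values rather than repeating the first one.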
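Faker: generating fake data
Faker is used in much the same way as Mimesis.
Faker supports multiple languages. The example below, for instance, outputs data for the default en_US locale and for Chinese, zh_CN.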
from faker import Faker
fake = Faker()
print(fake.name())
print(fake.address())
print(fake.city())
print("*" * 20)
fake = Faker('zh_CN')
print(fake.name())
print(fake.address())
print(fake.city())
Output:
Peggy Wood
17796 Johnson Fork Apt. 744
Donaldhaven, DC 41460-2738
Cannonland
********************
向鵬
廣西壯族自治區梅縣大興夏路w座 617055
杭州市
Faker offers a number of data providers that generate different categories of data. Below are some commonly used providers and their basic usage.
from faker import Faker
fake = Faker()
# lorem ipsum
print(fake.word())
print(fake.text())
print(fake.paragraphs(5))
print(fake.company())
print(fake.credit_card_full())
print(fake.address())
print(fake.phone_number())
print(fake.date() + ' ' + fake.time())
print(fake.profile())
faker.providers supports creating custom providers:
from faker import Faker
from faker.providers import BaseProvider
fake = Faker()
class oneProvider(BaseProvider):
    def hello(self):
        return "I am one provider"
fake.add_provider(oneProvider)
print(fake.hello())
Faker also ships a command-line tool, which is very convenient. For usage details and the available options, see the official documentation.
└─[0] faker -h
usage: faker [-h] [--version] [-o output] [-l LOCALE] [-r REPEAT] [-s SEP]
[-i [INCLUDE [INCLUDE ...]]]
[fake] [fake argument [fake argument ...]]
faker version 0.8.8
positional arguments:
fake name of the fake to generate output for (e.g. profile)
fake argument optional arguments to pass to the fake (e.g. the
profile fake takes an optional list of comma separated
field names as the first argument)
optional arguments:
-h, --help show this help message and exit
--version show program's version number and exit
-o output redirect output to a file
-l LOCALE, --lang LOCALE
specify the language for a localized provider (e.g.
de_DE)
-r REPEAT, --repeat REPEAT
generate the specified number of outputs
-s SEP, --sep SEP use the specified separator after each output
-i [INCLUDE [INCLUDE ...]], --include [INCLUDE [INCLUDE ...]]
list of additional custom providers to user, given as
the import path of the module containing your Provider
class (not the provider class itself)
supported locales:
ar_AA, ar_EG, ar_JO, ar_PS, ar_SA, bg_BG, bs_BA, cs_CZ, de_AT, de_DE, dk_DK, el_GR, en, en_AU, en_CA, en_GB, en_TH, en_US, es, es_ES, es_MX, et_EE, fa_IR, fi_FI, fr_CH, fr_FR, he_IL, hi_IN, hr_HR, hu_HU, id_ID, it_IT, ja_JP, ka_GE, ko_KR, la, lt_LT, lv_LV, ne_NP, nl_BE, nl_NL, no_NO, pl_PL, pt_BR, pt_PT, ru_RU, sk_SK, sl_SI, sv_SE, th_TH, tr_TR, tw_GH, uk_UA, zh_CN, zh_TW
faker can take a locale as an argument, to return localized data. If no
localized provider is found, the factory falls back to the default en_US
locale.
examples:
$ faker address
968 Bahringer Garden Apt. 722
Kristinaland, NJ 09890
$ faker -l de_DE address
Samira-Niemeier-Allee 56
94812 Biedenkopf
$ faker profile ssn,birthdate
{'ssn': u'628-10-1085', 'birthdate': '2008-03-29'}
$ faker -r=3 -s=";" name
Willam Kertzmann;
Josiah Maggio;
Gayla Schmitt;
Calling seed() makes previously generated random data reproducible, so that every run of the code produces the same data. The snippet below yields the same result each time instead of fresh random data on every run.
from faker import Faker
fake = Faker()
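# instance-level seeding works with the Faker 0.8.8 shown above;
# newer Faker releases expect the class-level call Faker.seed(9527) instead
fake.seed(9527)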
print(fake.name())
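The same principle can be shown with Python's own random module. This is a minimal sketch of seeded reproducibility in general, not of Faker's internals; the `fake_names` helper and its name pools are hypothetical.

```python
import random


def fake_names(seed, count=3):
    """Generate `count` pseudo-random names from a private, seeded generator."""
    rng = random.Random(seed)
    first = ["Anna", "Ben", "Chloe", "Dan"]
    last = ["Reed", "Stone", "Vance", "Young"]
    return [f"{rng.choice(first)} {rng.choice(last)}" for _ in range(count)]


run_one = fake_names(9527)
run_two = fake_names(9527)
print(run_one == run_two)  # the same seed replays the same sequence
```

Using a private `random.Random` instance rather than the module-level functions keeps the seed from affecting other code that also relies on random numbers.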