Preprocessing: similarity measured as the number of common words
Steps:
- Firstly, tokenizing the text.
- Throwing away words that occur way too often to be of any help in detecting relevant posts.
- Throwing away words that occur so seldom that there is only a small chance that they occur in future posts.
- Counting the remaining words.
- Calculating TF-IDF values from the counts, considering the whole text corpus (see the sketch below).
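To make the last step concrete, here is a minimal hand-rolled sketch of TF-IDF over pre-tokenized documents. It is our illustration only, not the exact formula scikit-learn uses (TfidfVectorizer adds smoothing and normalization, so its values will differ):

```python
import math

def tfidf(term, doc, corpus):
    # term frequency: fraction of the tokens in `doc` that equal `term`
    tf = doc.count(term) / len(doc)
    # inverse document frequency: log(total docs / docs containing `term`)
    idf = math.log(len(corpus) / len([d for d in corpus if term in d]))
    return tf * idf

# toy corpus of already-tokenized documents
a, abb, abc = ["a"], ["a", "b", "b"], ["a", "b", "c"]
corpus = [a, abb, abc]
print(tfidf("a", a, corpus))    # 0.0  -- "a" occurs in every doc, so it carries no signal
print(tfidf("b", abb, corpus))  # ~0.27
print(tfidf("c", abc, corpus))  # ~0.37
```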
Testing the code
- Data description
Post filename | Post content |
---|---|
01.txt | This is a toy post about machine learning. Actually, it contains not much interesting stuff. |
02.txt | Imaging databases provide storage capabilities. |
03.txt | Most imaging databases save images permanently. |
04.txt | Imaging databases store data. |
05.txt | Imaging databases store data. Imaging databases store data. Imaging databases store data. |
- Python code
```python
import os
import sys
import scipy as sp
from sklearn.feature_extraction.text import CountVectorizer
from utils import DATA_DIR

TOY_DIR = os.path.join(DATA_DIR, "toy")
# read in the toy posts
posts = [open(os.path.join(TOY_DIR, f)).read() for f in os.listdir(TOY_DIR)]
new_post = "imaging databases"

import nltk.stem
english_stemmer = nltk.stem.SnowballStemmer('english')

# Extending the vectorizer with NLTK's stemmer
class StemmedCountVectorizer(CountVectorizer):
    def build_analyzer(self):
        analyzer = super(StemmedCountVectorizer, self).build_analyzer()
        return lambda doc: (english_stemmer.stem(w) for w in analyzer(doc))

# vectorizer = CountVectorizer(min_df=1, stop_words='english',
#                              preprocessor=stemmer)
vectorizer = StemmedCountVectorizer(min_df=1, stop_words='english')

from sklearn.feature_extraction.text import TfidfVectorizer

# Extending the TF-IDF vectorizer with NLTK's stemmer
class StemmedTfidfVectorizer(TfidfVectorizer):
    def build_analyzer(self):
        analyzer = super(StemmedTfidfVectorizer, self).build_analyzer()
        return lambda doc: (english_stemmer.stem(w) for w in analyzer(doc))

vectorizer = StemmedTfidfVectorizer(
    min_df=1, stop_words='english', decode_error='ignore')

X_train = vectorizer.fit_transform(posts)
num_samples, num_features = X_train.shape
print("#samples: %d, #features: %d" % (num_samples, num_features))
# #samples: 5, #features: 17

new_post_vec = vectorizer.transform([new_post])
# the vector for the new post, in sparse representation
print(new_post_vec, type(new_post_vec))
#   (0, 5)  0.7071067811865476
#   (0, 4)  0.7071067811865476
# <class 'scipy.sparse.csr.csr_matrix'>

# the same vector as a dense ndarray
print(new_post_vec.toarray())
# [[0. 0. 0. 0. 0.70710678 0.70710678 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]

# the words that have been tokenized and stemmed
print(vectorizer.get_feature_names())
# ['actual', 'capabl', 'contain', 'data', 'databas', 'imag', 'interest',
#  'learn', 'machin', 'perman', 'post', 'provid', 'save', 'storag',
#  'store', 'stuff', 'toy']
```
```python
# compute the distance between two vectors
def dist_raw(v1, v2):
    delta = v1 - v2
    # norm of the difference vector
    return sp.linalg.norm(delta.toarray())

def dist_norm(v1, v2):
    v1_normalized = v1 / sp.linalg.norm(v1.toarray())  # normalize v1
    v2_normalized = v2 / sp.linalg.norm(v2.toarray())  # normalize v2
    delta = v1_normalized - v2_normalized
    return sp.linalg.norm(delta.toarray())

dist = dist_norm
best_dist = sys.maxsize
best_i = None
for i in range(0, num_samples):
    post = posts[i]
    if post == new_post:
        continue
    post_vec = X_train.getrow(i)
    d = dist(post_vec, new_post_vec)
    # print how each post compares to new_post
    print("=== Post %i with dist=%.2f: %s" % (i, d, post))
    if d < best_dist:
        best_dist = d
        best_i = i

# report the post closest to new_post, together with its distance
print("Best post is %i with dist=%.2f" % (best_i, best_dist))
```

```
=== Post 0 with dist=1.41: This is a toy post about machine learning. Actually, it contains not much interesting stuff.
=== Post 1 with dist=1.08: Imaging databases provide storage capabilities.
=== Post 2 with dist=0.86: Most imaging databases save images permanently.
=== Post 3 with dist=0.92: Imaging databases store data.
=== Post 4 with dist=0.92: Imaging databases store data. Imaging databases store data. Imaging databases store data.
Best post is 2 with dist=0.86
```
Summary
By building this simple yet efficient model, we can turn texts into compact "feature vectors" and then judge and compare them. The model does have some shortcomings, though, quoted directly from the original book:
- It does not cover word relations. With the previous vectorization approach, the text "Car hits wall" and "Wall hits car" will both have the same feature vector.
- It does not capture negations correctly. For instance, the text "I will eat ice cream" and "I will not eat ice cream" will look very similar by means of their feature vectors, although they contain quite the opposite meaning. This problem, however, can be easily changed by not only counting individual words, also called unigrams, but also considering bigrams (pairs of words) or trigrams (three words in a row); see the sketch after this list.
- It totally fails with misspelled words. Although it is clear to the readers that "database" and "databas" convey the same meaning, our approach will treat them as totally different words.
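As a quick illustration of the bigram remedy mentioned above, here is a minimal sketch of ours, assuming the same scikit-learn version as the code above (newer releases rename get_feature_names to get_feature_names_out). With ngram_range=(1, 2), CountVectorizer adds word pairs to the vocabulary, so the negation "will not" becomes a feature of its own:

```python
from sklearn.feature_extraction.text import CountVectorizer

# count unigrams and bigrams instead of unigrams only
bigram_vectorizer = CountVectorizer(ngram_range=(1, 2))
bigram_vectorizer.fit_transform(
    ["I will eat ice cream", "I will not eat ice cream"])
print(bigram_vectorizer.get_feature_names())
# ['cream', 'eat', 'eat ice', 'ice', 'ice cream', 'not', 'not eat',
#  'will', 'will eat', 'will not']
```

The two sentences now differ in the 'will eat', 'will not', and 'not eat' features, whereas with unigrams alone they differ only in 'not'.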
Clustering
From Wang Tianyi's course "Fundamentals of Artificial Intelligence", lesson "Birds of a feather flock together: cluster analysis".
Cluster analysis is an unsupervised learning method. Its goal is to learn from training samples that carry no class labels, so as to reveal the intrinsic structure and regularities of the data. Concretely, cluster analysis partitions a dataset into a number of disjoint subsets such that, under some measure, the elements of each subset are more similar to one another than to elements outside it.
The subsets produced by such a partition are the "clusters", and each cluster represents a potential category. This is exactly the difference between classification and clustering: classification fixes the categories first and then assigns the data; clustering partitions the data first and then determines the categories.
Cluster analysis is not one specific algorithm but a general task to be solved, and the name already points to the task's two core questions: first, how to decide which samples belong to the same "cluster"; second, how to make the samples of one cluster "gather" together.
Deciding which samples belong to the same "cluster" requires a measure of similarity. Whatever criterion is chosen, the guiding principle of cluster analysis is to make the differences between samples within a cluster as small as possible and the differences between samples from different clusters as large as possible. The simplest way to measure similarity is to introduce a distance metric: cluster analysis decides whether samples belong to the same cluster by computing the distances between them. By linear algebra, if every sample has N features, the samples can be treated as points in an N-dimensional space, and the distances between those points can be computed.
As a mathematical concept, a distance must satisfy non-negativity (it is never less than 0), identity (the distance from any point to itself is 0), symmetry (swapping the two points does not change the distance), and the triangle inequality. A distance commonly used in cluster analysis is the Minkowski distance, defined as

$$d(x, y) = \left( \sum_{i=1}^{N} |x_i - y_i|^p \right)^{1/p}$$

where p is a constant. When p = 2, the Minkowski distance becomes the Euclidean distance, that is, length in the usual sense.
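A small code sketch of ours (not part of the course text) makes the role of p concrete:

```python
import numpy as np

def minkowski(x, y, p):
    # Minkowski distance between two points given as NumPy arrays
    return np.sum(np.abs(x - y) ** p) ** (1.0 / p)

x, y = np.array([0.0, 0.0]), np.array([3.0, 4.0])
print(minkowski(x, y, 2))  # 5.0 -- p=2 is the usual Euclidean length
print(minkowski(x, y, 1))  # 7.0 -- p=1 is the Manhattan distance
```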
K-means algorithm
From Wang Tianyi's course "Fundamentals of Artificial Intelligence", lesson "Birds of a feather flock together: cluster analysis".
The K-means algorithm is the archetypal prototype-based clustering algorithm; it recasts clustering as an optimization problem. Concretely, it finds k cluster centers and assigns every sample to its nearest center, under the criterion that the sum of squared distances from all samples to their cluster centers be minimal. Obviously, the smaller this sum of squares, the more similar the samples within each cluster. This optimization problem cannot be solved exactly, so only approximate solutions can be searched for: the K-means algorithm applies a greedy strategy, approximating the minimum sum of squares through iterative refinement.
The K-means procedure is quite intuitive. First, k samples are picked at random from the dataset to serve as the centers of the k clusters. Next, for every remaining sample, the distances to these k centers are computed and the sample is assigned to the cluster whose center is nearest. Once every sample's cluster membership is settled, the arithmetic mean of all samples in each cluster is computed and becomes the cluster's new center, and all samples are re-clustered against the k new centers. This "average, recompute centers, re-cluster" loop iterates until the clustering no longer changes (a from-scratch sketch follows below).
Most K-means-style algorithms require the number of clusters k to be fixed in advance, which is a main point of criticism against the algorithm. Moreover, because the algorithm optimizes the center of each cluster, K-means tends to carve out clusters of similar size, which can blur the accuracy of the cluster boundaries.
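To make the "average, recompute centers, re-cluster" loop concrete, here is a minimal from-scratch sketch. This is our illustration under simplifying assumptions (Euclidean distance, no handling of clusters that end up empty), not the course's or scikit-learn's implementation:

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    # start from k randomly chosen samples as the initial centers
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # assign every sample to its nearest center (Euclidean distance)
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # recompute each center as the arithmetic mean of its samples
        new_centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new_centers, centers):
            break  # the clustering no longer changes
        centers = new_centers
    return labels, centers
```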
K-means example
"""
====================================
Demonstration of k-means assumptions
====================================
This example is meant to illustrate situations where k-means will produce
unintuitive and possibly unexpected clusters. In the first three plots, the
input data does not conform to some implicit assumption that k-means makes and
undesirable clusters are produced as a result. In the last plot, k-means
returns intuitive clusters despite unevenly sized blobs.
"""
print(__doc__)
# Author: Phil Roth <mr.phil.roth@gmail.com>
# License: BSD 3 clause
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
plt.figure(figsize=(12, 12))
n_samples = 1500
random_state = 170
X, y = make_blobs(n_samples=n_samples, random_state=random_state)
# Incorrect number of clusters
y_pred = KMeans(n_clusters=2, random_state=random_state).fit_predict(X)
plt.subplot(221)
plt.scatter(X[:, 0], X[:, 1], c=y_pred)
plt.title("Incorrect Number of Blobs")
# Anisotropicly distributed data
transformation = [[0.60834549, -0.63667341], [-0.40887718, 0.85253229]]
X_aniso = np.dot(X, transformation)
y_pred = KMeans(n_clusters=3, random_state=random_state).fit_predict(X_aniso)
plt.subplot(222)
plt.scatter(X_aniso[:, 0], X_aniso[:, 1], c=y_pred)
plt.title("Anisotropicly Distributed Blobs")
# Different variance
X_varied, y_varied = make_blobs(n_samples=n_samples,
cluster_std=[1.0, 2.5, 0.5],
random_state=random_state)
y_pred = KMeans(n_clusters=3, random_state=random_state).fit_predict(X_varied)
plt.subplot(223)
plt.scatter(X_varied[:, 0], X_varied[:, 1], c=y_pred)
plt.title("Unequal Variance")
# Unevenly sized blobs
X_filtered = np.vstack((X[y == 0][:500], X[y == 1][:100], X[y == 2][:10]))
y_pred = KMeans(n_clusters=3,
random_state=random_state).fit_predict(X_filtered)
plt.subplot(224)
plt.scatter(X_filtered[:, 0], X_filtered[:, 1], c=y_pred)
plt.title("Unevenly Sized Blobs")
plt.show()
Applying the K-means algorithm
Data source
20newsgroups is a collection of almost 20,000 newsgroup documents spanning 20 topics; for details see 20 Newsgroups. The sklearn.datasets module ships with the 20newsgroups dataset.
Code analysis
```python
import sklearn.datasets
import scipy as sp

new_post = \
    """Disk drive problems. Hi, I have a problem with my hard disk.
After 1 year it is working only sporadically now.
I tried to format it, but now it doesn't boot any more.
Any ideas? Thanks.
"""

print("""\
Dear reader of the 1st edition of 'Building Machine Learning Systems with Python'!
For the 2nd edition we introduced a couple of changes that will result into
results that differ from the results in the 1st edition.
E.g. we now fully rely on scikit's fetch_20newsgroups() instead of requiring
you to download the data manually from MLCOMP.
If you have any questions, please ask at http://www.twotoreal.com
""")

all_data = sklearn.datasets.fetch_20newsgroups(subset="all")
# total number of posts
print("Number of total posts: %i" % len(all_data.filenames))

groups = [
    'comp.graphics', 'comp.os.ms-windows.misc', 'comp.sys.ibm.pc.hardware',
    'comp.sys.mac.hardware', 'comp.windows.x', 'sci.space']
train_data = sklearn.datasets.fetch_20newsgroups(subset="train",
                                                 categories=groups)
# number of training posts in the selected tech groups
print("Number of training posts in tech groups:", len(train_data.filenames))
```

```
Dear reader of the 1st edition of 'Building Machine Learning Systems with Python'!
For the 2nd edition we introduced a couple of changes that will result into
results that differ from the results in the 1st edition.
E.g. we now fully rely on scikit's fetch_20newsgroups() instead of requiring
you to download the data manually from MLCOMP.
If you have any questions, please ask at http://www.twotoreal.com

Number of total posts: 18846
Number of training posts in tech groups: 3529
```
```python
# apply the preprocessing steps from the "similarity measured as the
# number of common words" section above
labels = train_data.target
num_clusters = 50  # sp.unique(labels).shape[0]

import nltk.stem
english_stemmer = nltk.stem.SnowballStemmer('english')

from sklearn.feature_extraction.text import TfidfVectorizer

class StemmedTfidfVectorizer(TfidfVectorizer):
    def build_analyzer(self):
        analyzer = super(StemmedTfidfVectorizer, self).build_analyzer()
        return lambda doc: (english_stemmer.stem(w) for w in analyzer(doc))

vectorizer = StemmedTfidfVectorizer(min_df=10, max_df=0.5,
                                    stop_words='english',
                                    decode_error='ignore')
vectorized = vectorizer.fit_transform(train_data.data)
num_samples, num_features = vectorized.shape
print("#samples: %d, #features: %d" % (num_samples, num_features))

# run K-means
from sklearn.cluster import KMeans

km = KMeans(n_clusters=num_clusters, n_init=1, verbose=1, random_state=3)
clustered = km.fit(vectorized)
print("km.labels_=%s" % km.labels_)
print("km.labels_.shape=%s" % km.labels_.shape)

# evaluate the K-means clustering against the true newsgroup labels
from sklearn import metrics

print("Homogeneity: %0.3f" % metrics.homogeneity_score(labels, km.labels_))
# Homogeneity: 0.400
print("Completeness: %0.3f" % metrics.completeness_score(labels, km.labels_))
# Completeness: 0.206
print("V-measure: %0.3f" % metrics.v_measure_score(labels, km.labels_))
# V-measure: 0.272
print("Adjusted Rand Index: %0.3f" %
      metrics.adjusted_rand_score(labels, km.labels_))
# Adjusted Rand Index: 0.064
print("Adjusted Mutual Information: %0.3f" %
      metrics.adjusted_mutual_info_score(labels, km.labels_))
# Adjusted Mutual Information: 0.197
# the silhouette coefficient is computed on the predicted cluster labels
print("Silhouette Coefficient: %0.3f" %
      metrics.silhouette_score(vectorized, km.labels_, sample_size=1000))
```
```
Initialization complete
Iteration 0, inertia 5686.053
Iteration 1, inertia 3164.888
Iteration 2, inertia 3132.208
Iteration 3, inertia 3111.713
Iteration 4, inertia 3098.584
Iteration 5, inertia 3092.191
Iteration 6, inertia 3087.277
Iteration 7, inertia 3084.100
Iteration 8, inertia 3082.800
Iteration 9, inertia 3082.234
Iteration 10, inertia 3081.949
Iteration 11, inertia 3081.843
Iteration 12, inertia 3081.791
Iteration 13, inertia 3081.752
Iteration 14, inertia 3081.660
Iteration 15, inertia 3081.617
Iteration 16, inertia 3081.589
Iteration 17, inertia 3081.571
Converged at iteration 17: center shift 0.000000e+00 within tolerance 2.069005e-08
km.labels_=[48 23 31 ... 6 2 22]
km.labels_.shape=3529
```
```python
# find the posts that sit in the same cluster as new_post
new_post_vec = vectorizer.transform([new_post])
new_post_label = km.predict(new_post_vec)[0]
similar_indices = (km.labels_ == new_post_label).nonzero()[0]

# rank them by distance to new_post
similar = []
for i in similar_indices:
    dist = sp.linalg.norm((new_post_vec - vectorized[i]).toarray())
    similar.append((dist, train_data.data[i]))
similar = sorted(similar)

# how many posts share new_post's cluster
print("Count similar: %i" % len(similar))

show_at_1 = similar[0]
show_at_2 = similar[int(len(similar) / 10)]
show_at_3 = similar[int(len(similar) / 2)]
print("=== #1 ===")
print(show_at_1)
print()
print("=== #2 ===")
print(show_at_2)
print()
print("=== #3 ===")
print(show_at_3)
```
```
Count similar: 56
=== #1 ===
(1.0378441731334074, "From: Thomas Dachsel GERTHD@mvs.sas.com\nSubject: BOOT PROBLEM with IDE controller\nNntp-Posting-Host: sdcmvs.mvs.sas.com\nOrganization: SAS Institute Inc.\nLines: 25\n\nHi,\nI've got a Multi I/O card (IDE controller + serial/parallel\ninterface) and two floppy drives (5 1/4, 3 1/2) and a\nQuantum ProDrive 80AT connected to it.\nI was able to format the hard disk, but I could not boot from\nit. I can boot from drive A: (which disk drive does not matter)\nbut if I remove the disk from drive A and press the reset switch,\nthe LED of drive A: continues to glow, and the hard disk is\nnot accessed at all.\nI guess this must be a problem of either the Multi I/o card\nor floppy disk drive settings (jumper configuration?)\nDoes someone have any hint what could be the reason for it.\nPlease reply by email to GERTHD@MVS.SAS.COM\nThanks,\nThomas\n+-------------------------------------------------------------------+\n| Thomas Dachsel |\n| Internet: GERTHD@MVS.SAS.COM |\n| Fidonet: Thomas_Dachsel@camel.fido.de (2:247/40) |\n| Subnet: dachsel@rnivh.rni.sub.org (UUCP in Germany, now active) |\n| Phone: +49 6221 4150 (work), +49 6203 12274 (home) |\n| Fax: +49 6221 415101 |\n| Snail: SAS Institute GmbH, P.O.Box 105307, D-W-6900 Heidelberg |\n| Tagline: One bad sector can ruin a whole day... |\n+-------------------------------------------------------------------+\n")
=== #2 ===
(1.1503043264096682, 'From: rpao@mts.mivj.ca.us (Roger C. Pao)\nSubject: Re: Booting from B drive\nOrganization: MicroTech Software\nLines: 34\n\nglang@slee01.srl.ford.com (Gordon Lang) writes:\n\n>David Weisberger (djweisbe@unix.amherst.edu) wrote:\n>: I have a 5 1/4" drive as drive A. How can I make the system boot from\n>: my 3 1/2" B drive? (Optimally, the computer would be able to boot\n>: from either A or B, checking them in order for a bootable disk. But\n>: if I have to switch cables around and simply switch the drives so that\n>: it can't boot 5 1/4" disks, that's OK. Also, boot_b won't do the trick\n>: for me.)\n>: \n>: Thanks,\n>: Davebo\n>We had the same issue plague us for months on our Gateway. I finally\n>got tired of it so I permanently interchanged the drives. The only\n>reason I didn't do it in the first place was because I had several\n>bootable 5-1/4's and some 5-1/4 based install disks which expected\n>the A drive. I order all new software (and upgrades) to be 3-1/2 and\n>the number of "stupid" install programs that can't handle an alternate\n>drive are declining with time - the ones I had are now upgraded. And\n>as for the bootable 5-1/4's I just cut 3-1/2 replacements.\n\n>If switching the drives is not an option, you might be able to wire up\n>a drive switch to your computer chasis. I haven't tried it but I think\n>it would work as long as it is wired carefully.\n\nI did this. I use a relay (Radio Shack 4PDT) instead of a huge\nswitch. This way, if the relay breaks, my drives will still work.\n\nIt works fine, but you may still need to change the CMOS before the\ndrive switch will work correctly for some programs.\n\nrp93\n-- \nRoger C. Pao {gordius,bagdad}!mts!rpao, rpao@mts.mivj.ca.us\n')
=== #3 ===
(1.2793959084781283, 'From: vg@volkmar.Stollmann.DE (Volkmar Grote)\nSubject: IBM PS/1 vs TEAC FD\nDistribution: world\nOrganization: Me? Organized?\nLines: 21\n\nHello,\n\nI already tried our national news group without success.\n\nI tried to replace a friend's original IBM floppy disk in his PS/1-PC\nwith a normal TEAC drive.\nI already identified the power supply on pins 3 (5V) and 6 (12V), shorted\npin 6 (5.25"/3.5" switch) and inserted pullup resistors (2K2) on pins\n8, 26, 28, 30, and 34.\nThe computer doesn't complain about a missing FD, but the FD's light\nstays on all the time. The drive spins up o.k. when I insert a disk,\nbut I can't access it.\nThe TEAC works fine in a normal PC.\n\nAre there any points I missed?\n\nThank you.\n\tVolkmar\n\n---\nVolkmar.Grote@Stollmann.DE\n')
```