Reference URLs:
https://www.pythoncentral.io/python-generators-and-yield-keyword/
https://stackoverflow.com/questions/519633/lazy-method-for-reading-big-file-in-python

1. yield is the keyword that turns a function into a generator. A generator only produces data at the moment it is asked for, which is why the second snippet below raises an error (a generator has no length), while in the first snippet the list is built entirely in memory up front.

the_list = [2**x for x in range(5)]
print(type(the_list))    # <class 'list'>
print(len(the_list))     # 5 -- the list already holds every element in memory

the_generator = (x + x for x in range(3))
print(the_generator)          # <generator object ...>
print(len(the_generator))     # raises TypeError: object of type 'generator' has no len()
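Because a generator has no len(), its values have to be pulled out one by one (or materialized into a list). A minimal sketch, reusing the same generator expression:

the_generator = (x + x for x in range(3))
print(next(the_generator))              # 0 -- values are produced one at a time
print(list(the_generator))              # [2, 4] -- list() drains whatever is left
print(len([x + x for x in range(3)]))   # 3 -- a list comprehension does support len()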

2. Goal: read data only at the moment it is needed, and do no reading when it is not needed.

def search(keyword, filename):
    print("generator started")
    f = open(filename, 'r')

    for line in f:
        if keyword in line:
            yield line    # lazily hand back each matching line

    f.close()

the_generator = search('search', 'c:/tmp.txt')

print(type(the_generator))   # <class 'generator'> -- calling search() does not run its body yet
print(type(search))          # <class 'function'>

print(next(the_generator))   # run this line repeatedly until every line containing 'search' in tmp.txt has been yielded; the next call then raises StopIteration
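Instead of calling next() by hand, a for loop consumes the generator and stops cleanly when it is exhausted. A minimal sketch, assuming the same c:/tmp.txt file from the snippet above exists:

for line in search('search', 'c:/tmp.txt'):
    print(line, end='')   # each matching line is read and printed lazily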

3. Example 3:

def hold_client(name):
    yield 'Hello, %s! You will be connected soon' % name
    yield 'Dear %s, could you please wait a bit.' % name
    yield 'Sorry %s, we will play a nice music for you!' % name
    yield '%s, your call is extremely important to us!' % name

tmp = hold_client("xiaoming")


print(next(tmp))  # the four next() calls below print, in order, every string yielded by hold_client; a fifth call would raise StopIteration
print(next(tmp))
print(next(tmp))
print(next(tmp))
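The same messages can also be consumed with a for loop, which ends automatically when the generator is exhausted instead of raising StopIteration on an extra next(). A minimal sketch:

for message in hold_client("xiaoming"):
    print(message)   # prints all four messages, then the loop simply ends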

4. Example 4:

# Fibonacci sequence implemented with yield
def fibonacci(n):
    curr = 1
    prev = 0
    counter = 0
    while counter < n:
        yield curr
        prev, curr = curr, prev + curr
        counter += 1


fi = fibonacci(5)

print(next(fi))
print(next(fi))
print(next(fi))
print(next(fi))
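fibonacci(5) yields five values in total, so one more next(fi) would still succeed and a sixth would raise StopIteration. To collect every value at once, list() can drain the generator. A minimal sketch:

print(list(fibonacci(10)))   # [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]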

5. Example 5:

# Read a large file with the yield keyword.
# Two approaches: a plain read()-based version, and a pandas version (see the pandas sketch at the end of this section).

def read_in_chunks(file, chunk_size=1024):
    # lazy generator; the default chunk size is 1 KB
    while True:
        data = file.read(chunk_size)
        if not data:
            break
        yield data

f = open('filePath')

da = read_in_chunks(f)   # pass the open file object, not the path string
next(da)                 # returns the first 1 KB chunk
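In practice the chunks are usually processed in a for loop, with the file closed by a with block. A minimal sketch, keeping the 'filePath' placeholder from above:

with open('filePath') as f:
    for chunk in read_in_chunks(f):
        # do something with each 1 KB chunk; here we just report its size
        print(len(chunk))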

# Reading a truly large file works the same way:
# fetch the data piece by piece with the yield keyword.
def read_in_chunks(filepath, chunksize=102400):
    f = open(filepath, 'r')
    while True:
        # each read() consumes chunksize characters and leaves the file pointer
        # at the end of the chunk just read
        data = f.read(chunksize)
        if not data:
            break
        yield data
    f.close()

sreader = read_in_chunks('E:/tmp/tmp.csv')
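For the pandas variant mentioned above, pandas.read_csv with the chunksize argument already returns a lazy iterator of DataFrames, so no hand-written generator is needed. A minimal sketch, assuming pandas is installed and the same E:/tmp/tmp.csv file:

import pandas as pd

# chunksize is the number of rows parsed per iteration,
# so the whole CSV is never loaded at once
for chunk in pd.read_csv('E:/tmp/tmp.csv', chunksize=100000):
    print(chunk.shape)   # each chunk is an ordinary DataFrame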

6. Extension: for very large files, mmap can also be used.
This is only a simple example; study further for the details.

import mmap

with open('E:/tmp/tmp.csv', 'r+') as f:
    mm = mmap.mmap(f.fileno(), 0)   # map the whole file (length 0 means the entire file)

    print(mm.readline())   # first line, returned as bytes

    mm.seek(0)             # move back to the start of the mapping

    print(mm.readline())   # the same first line again

    mm.close()
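The mapped file can also be walked line by line without loading it all, for example with iter() and a sentinel. A minimal sketch, assuming the same file:

import mmap

with open('E:/tmp/tmp.csv', 'r+') as f:
    mm = mmap.mmap(f.fileno(), 0)
    # readline() returns b'' at end of file, which stops the iteration
    for line in iter(mm.readline, b''):
        print(line)
    mm.close()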