A Breakdown of the Business English Corpus

 

The corpus was built on the principle of distinguishing the text types used for doing business from those used for ‘talking about’ business. The categories below reflect this distinction.

 

WRITING ABOUT BUSINESS

 

PART OF CORPUS                   TOKENS   CONTENTS
BUSINESS BOOKS                   53,470   5 extracts from different books (approx. 10,000 words each)[1]
BUSINESS NEWSPAPERS              64,291   121 articles
BUSINESS JOURNALS & MAGAZINES    78,846   52 articles
TOTAL                           196,607


WRITING TO DO BUSINESS

 

PART OF CORPUS                   TOKENS   CONTENTS
ANNUAL REPORTS                   34,537   3 annual reports
BUSINESS PRESS RELEASES          21,656   29 business press releases
BUSINESS CONTRACTS               29,602   13 contracts/agreements
BUSINESS FAXES                   23,105   114 faxes
BUSINESS LETTERS                 26,793   94 letters
BUSINESS REPORTS                 62,908   17 reports
COMPANY BROCHURES                23,239   13 company brochures
EMAILS                           28,857   202 emails
JOB ADVERTISEMENTS               22,293   87 job advertisements
MANUALS                          21,160   5 manuals
MEMOS                            12,542   47 memos
MINUTES                          34,805   15 sets of minutes
PRODUCT BROCHURES                26,175   19 product brochures
QUOTATIONS                        8,997   21 quotations
MISCELLANEOUS                     2,427   OHT, job description & agendas
TOTAL                           379,096


TALKING ABOUT BUSINESS

 

PART OF CORPUS                   TOKENS   CONTENTS
INTERVIEWS                       70,894   24 interviews
BUSINESS ON RADIO & TV          148,983   72 broadcasts
TOTAL                           219,877


SPEAKING TO DO BUSINESS

 

PART OF CORPUS                   TOKENS   CONTENTS
JOB INTERVIEWS                   17,447   6 interviews
  3 job interviews
  3 job assessment interviews
MEETINGS                        126,243   15 meetings
  4 meetings                              technical
  5 meetings                              financial
  1 meeting                               orientation/planning
  4 meetings                              sales and marketing
  1 meeting                               presentation of new products
NEGOTIATING                      16,450   4 negotiation sessions
TELEPHONE CALLS                  30,414   89 phone conversations
SPEECHES/PRESENTATIONS           19,020   5 speeches
TRAINING SESSION                 17,867   1 session of technical training
TOTAL                           227,441
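
The section totals above can be checked mechanically. The following is a minimal Python sketch and is not part of the original breakdown: the dictionary layout and the names CORPUS, STATED_TOTALS and section_totals are illustrative assumptions, while the token counts are copied from the four tables. It recomputes each section total and also derives the overall corpus size, a figure the breakdown itself does not state.

```python
# Token counts copied from the four tables; the data layout is an assumption.
CORPUS = {
    "Writing about business": {
        "Business books": 53_470,
        "Business newspapers": 64_291,
        "Business journals & magazines": 78_846,
    },
    "Writing to do business": {
        "Annual reports": 34_537,
        "Business press releases": 21_656,
        "Business contracts": 29_602,
        "Business faxes": 23_105,
        "Business letters": 26_793,
        "Business reports": 62_908,
        "Company brochures": 23_239,
        "Emails": 28_857,
        "Job advertisements": 22_293,
        "Manuals": 21_160,
        "Memos": 12_542,
        "Minutes": 34_805,
        "Product brochures": 26_175,
        "Quotations": 8_997,
        "Miscellaneous": 2_427,
    },
    "Talking about business": {
        "Interviews": 70_894,
        "Business on radio & TV": 148_983,
    },
    "Speaking to do business": {
        "Job interviews": 17_447,
        "Meetings": 126_243,
        "Negotiating": 16_450,
        "Telephone calls": 30_414,
        "Speeches/presentations": 19_020,
        "Training session": 17_867,
    },
}

# Totals as printed in the TOTAL row of each table.
STATED_TOTALS = {
    "Writing about business": 196_607,
    "Writing to do business": 379_096,
    "Talking about business": 219_877,
    "Speaking to do business": 227_441,
}


def section_totals(corpus):
    """Sum the token counts of every part within each section."""
    return {name: sum(parts.values()) for name, parts in corpus.items()}


totals = section_totals(CORPUS)
grand_total = sum(totals.values())  # derived here, not stated in the breakdown

for name, total in totals.items():
    status = "matches" if total == STATED_TOTALS[name] else "differs from"
    share = total / grand_total
    print(f"{name:<26}{total:>10,}{share:>8.1%}  ({status} stated total)")
print(f"{'All sections':<26}{grand_total:>10,}")
```

On these figures the four sections sum to 1,023,021 tokens: 575,703 in the two written sections and 447,318 in the two spoken sections.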


[1] This is the only section of the corpus where short samples were used.