Цитата |
---|
Sanja написал: С чего это вдруг ‘недостаток’? По условию задачи этого не требовадось |
Я имела ввиду, что лучше сделать, чтобы можно было сохранять более одного раза без лишних действий)
Sub SafeAsPdf()
Dim arrSelSheets(), i As Long
Application.ScreenUpdating = False
ReDim arrSelSheets(1 To ActiveWindow.SelectedSheets.Count)
For i = 1 To UBound(arrSelSheets)
arrSelSheets(i) = ActiveWindow.SelectedSheets(i).Name
Next
Worksheets(Array(«КП»)).Select
ActiveSheet.ExportAsFixedFormat Type:=xlTypePDF, Filename:= _
ThisWorkbook.Path & «» & «КП » & Format(Now, «YYYYMMDD») & » » & Format(Now, «hhmm») & «.pdf», Quality:=xlQualityStandard, _
IncludeDocProperties:=True, IgnorePrintAreas:=False, OpenAfterPublish:=False
Worksheets(arrSelSheets).Select
Application.ScreenUpdating = True
MsgBox «Урррааа!», vbInformation
End Sub
Сохранение в PDF книги Excel, группы листов, одного листа или отдельного диапазона с помощью кода VBA. Метод ExportAsFixedFormat. Примеры экспорта.
Метод ExportAsFixedFormat
Метод ExportAsFixedFormat сохраняет рабочую книгу Excel или выбранную группу листов этой книги в один PDF-файл. Чтобы экспортировать каждый лист в отдельный файл, необходимо применить метод ExportAsFixedFormat к каждому сохраняемому листу.
Синтаксис
Expression.ExportAsFixedFormat (Type, FileName, Quality, IncludeDocProperties, IgnorePrintAreas, From, To, OpenAfterPublish, FixedFormatExtClassPtr) |
Expression – это выражение, представляющее объект Workbook, Worksheet или Range.
Параметры
Единственный обязательный параметр – Type, остальные можно не указывать – в этом случае будут применены значения по умолчанию.
Параметр | Описание |
---|---|
Type | Задает формат файла для экспорта книги или листа: xlTypePDF(0) – сохранение в файл PDF; xlTypeXPS(1) – сохранение в файл XPS*. |
FileName | Задает имя файла. При указании полного пути, файл будет сохранен в указанную папку, при указании только имени – в папку по умолчанию (в Excel 2016 – «Документы»). Если имя не задано (по умолчанию), файл будет сохранен с именем экспортируемой книги. |
Quality | Задает качество сохраняемых электронных таблиц: xlQualityMinimum(1) – минимальное качество; xlQualityStandard(0) – стандартное качество (по умолчанию). |
IncludeDocProperties | Включение свойств документа Excel в PDF: True(1) – включить; False(0) – не включать; мне не удалось обнаружить разницу и значение по умолчанию. |
IgnorePrintAreas | Указывает VBA, следует ли игнорировать области печати, заданные на листах файла Excel: True(1) – игнорировать области печати; False(0) – не игнорировать области печати (по умолчанию). |
From** | Задает номер листа книги Excel, с которого начинается экспорт. По умолчанию сохранение в PDF начинается с первого листа книги. |
To** | Задает номер листа книги Excel, на котором заканчивается экспорт. По умолчанию сохранение в PDF заканчивается на последнем листе книги. |
OpenAfterPublish | Указывает VBA на необходимость открыть созданный файл PDF средством просмотра: True(1) – открыть файл PDF для просмотра; False(0) – не открывать файл PDF для просмотра (по умолчанию). |
FixedFormatExtClassPtr | Указатель на класс FixedFormatExt (игнорируем этот параметр). |
* XPS – это редко использующийся фиксированный формат файлов, разработанный Microsoft, который похож на PDF, но основан на языке XML.
** Применимо только к книге (Workbook.ExportAsFixedFormat), при экспорте листа (Worksheet.ExportAsFixedFormat) указание параметров From и/или To приведет к ошибке.
Сохранение в PDF книги Excel
Экспорт всей книги
Sub Primer1() ThisWorkbook.ExportAsFixedFormat Type:=xlTypePDF, Filename:=«C:Testfile1.pdf», OpenAfterPublish:=True End Sub |
Если вы указываете путь к файлу, он должен существовать, иначе VBA сохранит файл с именем и в папку по умолчанию («ИмяКниги.pdf» в папку «Документы»).
Экспорт части книги
Этот способ позволяет сохранить в PDF группу листов, расположенных подряд:
Sub Primer2() ThisWorkbook.ExportAsFixedFormat Type:=xlTypePDF, Filename:=«C:Testfile2.pdf», _ From:=2, To:=4, OpenAfterPublish:=True End Sub |
Сохранение в PDF рабочих листов
Экспорт одного листа
Sub Primer3() ActiveSheet.ExportAsFixedFormat Type:=xlTypePDF, Filename:=«C:Testfile3.pdf», OpenAfterPublish:=True End Sub |
Экспорт диапазона
Sub Primer2() Лист4.Range(«A1:F6»).ExportAsFixedFormat Type:=xlTypePDF, Filename:=«C:Testfile4.pdf», OpenAfterPublish:=True End Sub |
Если экспортируемый диапазон расположен на активном листе, имя листа указывать не обязательно.
Экспорт группы листов
Этот способ удобен тем, что экспортировать в PDF можно листы, расположенные не подряд:
Sub Primer5() Sheets(Array(«Лист2», «Лист3», «Лист5»)).Select Selection.ExportAsFixedFormat Type:=xlTypePDF, Filename:=«C:Testfile5.pdf», OpenAfterPublish:=True End Sub |
Jun 01 2017
01:26 AM
— edited
Jun 01 2017
08:21 AM
Further to my previous post I have managed to adapt the code to do what I required: 1. Create a pdf of ActiveSheet 2. Save pdf into a specific location 3. Name the pdf with a file name based on the sheet date 4. Open an email, with selected recipients, and attach the current pdf to and email If you have any suggestions to develop the code or see any possible errors please let me know. Sub Sent_Email() Dim olApp As Object s = Range(«h1»).Value ‘ ActiveSheet.ExportAsFixedFormat Type:=xlTypePDF, FileName:= _ s, Quality:=xlQualityStandard, IncludeDocProperties _ :=True, IgnorePrintAreas:=False, OpenAfterPublish:=False PDF_File = «I:2017 — 2018Operations UnitDay SheetsDS_» & Format(Now, «YYMMDD») & «.pdf» Set olApp = CreateObject(«Outlook.Application») With olApp.CreateItem(0) .Subject = «Daily Resource Sheet» .To = ActiveSheet.Range(«C50») .Cc = ActiveSheet.Range(«C55») .Body = «Hi,» & vbLf & vbLf _ & «Please find attached the Daily Resource Sheet.» & vbLf & vbLf _ & «Regards,» & vbLf & vbLf _ & «Roads Operations Unit» .Attachments.Add PDF_File .Save .Display End With End Sub
best response confirmed by
Joe Gray (New Contributor)
Jun 01 2017
06:31 PM
— edited
Jun 01 2017
06:38 PM
Solution
This is code I use.
End Sub
Private Sub Email() Dim objOutlook As Object Dim objMail As Object Dim signature As String Dim oWB As Workbook Set oWB = ActiveWorkbook Set objOutlook = CreateObject("Outlook.Application") Set objMail = objOutlook.CreateItem(0) With objMail .Display End With signature = objMail.HTMLbody With objMail .To = oWB.Sheets("Sheet1").Range("A1").Value ''.SentOnBehalfOfName = "" .Subject = strMySubject ''.body = "Dear Sir," & vbNewLine & vbNewLine & "Add stuff here" & vbNewLine & vbNewLine & signature .HTMLbody = "<font face=" & Chr(34) & "Calibri" & Chr(34) & " size=" & Chr(34) & 4 & Chr(34) & ">" & "Dear Sir," & "<br> <br>" & "Add stuff here" & "<br> <br>" & signature & "</font>" .Attachments.Add (strSaveFileName + ".pdf") .Display End With Set objOutlook = Nothing Set objMail = Nothing End Sub
Jun 01 2017
06:37 PM
— edited
Jun 01 2017
06:37 PM
And this is code to export to PDF
You obviously need a full path as a string to save the file to
ActiveSheet.ExportAsFixedFormat Type:=xlTypePDF, Filename:=strSaveFileName, Quality:=xlQualityStandard, IncludeDocProperties:=False, IgnorePrintAreas:=False, OpenAfterPublish:=False ThisWorkbook.SaveAs Filename:=strSaveFileName, FileFormat:=xlOpenXMLTemplateMacroEnabled
Jun 02 2017
01:59 AM
Thank you Andrew
You‘re no doubt aware that I am new to Coding and very much appreciate your response. I will hopefully be able to pass on assistance to others in the very near future.
Jun 02 2017
07:33 AM
Andrew
Just to let you know that the code below works well for me, and thank you again.
Command Button: —
- Generates a pdf of the ActiveSheet
- Saves pdf to a specific location, indicated in Cell H1, with an auto Filename based on the Sheet Date within Cell I3, formatted as YYMMDD
- Opens Outlook and displays email with pdf attached (file with date «Now»)
- Auto fills Recipients, reading from Cell C50
- Email Body as required
When i’m happy i will change code to automatically send email not display.
for reference:-
Cell H1: =»I:2017 — 2018Operations UnitDay Sheets»&»DS_»&TEXT(I3,»yymmdd»)&».PDF»
Private Sub Email_Sheet_Click() Dim objOutlook As Object Dim objMail As Object Dim signature As String Dim oWB As Workbook Set oWB = ActiveWorkbook s = Range("h1").Value ' ActiveSheet.ExportAsFixedFormat Type:=xlTypePDF, FileName:= _ s, Quality:=xlQualityStandard, IncludeDocProperties _ :=True, IgnorePrintAreas:=False, OpenAfterPublish:=False PDF_File = "Insert Path hereDS_" & Format(Now, "YYMMDD") & ".pdf" Set objOutlook = CreateObject("Outlook.Application") Set objMail = objOutlook.CreateItem(0) With objMail .Display End With signature = objMail.HTMLbody With objMail .To = ActiveSheet.Range("C50") .Cc = ActiveSheet.Range("C55") .Subject = "Insert Subject Here" .HTMLbody = "<font face=" & Chr(34) & "Calibri" & Chr(34) & " size=" & Chr(34) & 4 & Chr(34) & ">" & "Hi," & "<br> <br>" & "Insert email body here" & "<br> <br>" & signature & "</font>" .Attachments.Add PDF_File .Save .Display End With Set objOutlook = Nothing Set objMail = Nothing End Sub
May 14 2018
08:28 AM
I’m very new to coding, so I wanted to add a little twist to this thread. Is it possible to have a code to convert the current spreadsheet to PDF, create an email through Outlook, but not save the document?
Thanks!
May 14 2018
12:47 PM
— edited
May 14 2018
12:47 PM
From what I remember it’s necessary to save the document to attach the file. However, you can save the PDF to a temporary directory and then once done using it you can just delete it:
It would look something like this:
TempFilePath = Environ$("temp") & "" 'This defines the filepath ---> C:UsersusernameAppDataLocalTemp TempFileName = "Your File Name Goes Here" 'Name File
ActiveWorkbook.SaveAs TempFilePath & TempFileName 'Save file in Temporary Directory 'Email Workbook to people With OutMail .To = "YourName@Email.Com" .CC = "" .BCC = "" .Subject = "This is the Subject line" .Body = "This is the email body" 'Use "Blah Blah Blah" & Chr(13) & "This is another line" .Attachments.Add TempFilePath & TempFileName .Send End With 'Delete the Temporary File Kill TempFilePath & TempFileName
May 23 2018
12:34 PM
Please help…
This code is perfect for what I want to do in order to simplify a procees repeated at work but am a little stuck. I have it working to a point..
PDF Is saving in the directory pre set in Cell A1 all ok.
Outlook then opens (no attachment and no email addresses inserted) and I then get the following error (See attached)
I do not need to save the doc with date. Happy to Modify Cell A1 (C:UsersmarkDesktopquotes12345.pdf each time its used with the prefered directory/filename. Clearly there is some of the code and how it works am not understanding.
I dont understand why the email addresses dont pull through from cells A2 and A3 and not sure what I need to do in order to have the file which has saved ok in the required directy attach to the email.
If anyone can help me get this working would be much appreciated also if any additional information is required just ask. Full VBA below….
Sub Email_Sheet_Click()
Dim objOutlook As Object
Dim objMail As Object
Dim signature As String
Dim oWB As Workbook
Set oWB = ActiveWorkbook
s = Range(«A1»).Value
‘
ActiveSheet.ExportAsFixedFormat Type:=xlTypePDF, Filename:= _
s, Quality:=xlQualityStandard, IncludeDocProperties _
:=True, IgnorePrintAreas:=False, OpenAfterPublish:=False
PDF_File = «Insert Path hereDS_» & Format(Now, «YYMMDD») & «.pdf»
Set objOutlook = CreateObject(«Outlook.Application»)
Set objMail = objOutlook.CreateItem(0)
With objMail
.Display
End With
signature = objMail.HTMLbody
With objMail
.To = ActiveSheet.Range(«A2»)
.Cc = ActiveSheet.Range(«A3»)
.Subject = «Insert Subject Here»
.HTMLbody = «<font face=» & Chr(34) & «Calibri» & Chr(34) & » size=» & Chr(34) & 4 & Chr(34) & «>» & «Hi,» & «<br> <br>» & «Insert email body here» & «<br> <br>» & signature & «</font>»
.Attachments.Add PDF_File
.Save
.Display
End With
Set objOutlook = Nothing
Set objMail = Nothing
End Sub
Any help will be much appreciated!
May 23 2018
12:49 PM
These small edits should make the code work (Please see changes in bold):
Sub Email_Sheet_Click() Dim objOutlook As Object Dim objMail As Object Dim signature As String Dim PDF_FileName As String Dim oWB As Workbook Set oWB = ActiveWorkbook 'Change your string to this.... PDF_FileName 'or change your cell value in A1 'This is the name of your PDF... 'Change accordingly.... PDF_FileName = "C:Usersmmickle1DesktopDS_" & Format(Now, "YYMMDD") & ".pdf" ActiveSheet.ExportAsFixedFormat Type:=xlTypePDF, Filename:= _ PDF_FileName, Quality:=xlQualityStandard, IncludeDocProperties _ :=True, IgnorePrintAreas:=False, OpenAfterPublish:=False Set objOutlook = CreateObject("Outlook.Application") Set objMail = objOutlook.CreateItem(0) With objMail .Display End With signature = objMail.HTMLbody With objMail .To = ActiveSheet.Range("A2") .Cc = ActiveSheet.Range("A3") .Subject = "Insert Subject Here" .HTMLbody = "<font face=" & Chr(34) & "Calibri" & Chr(34) & " size=" & Chr(34) & 4 & Chr(34) & ">" & "Hi," & "<br> <br>" & "Insert email body here" & "<br> <br>" & signature & "</font>" .Attachments.Add PDF_FileName 'Now that the name is correct it will work. .Save .Display End With Set objOutlook = Nothing Set objMail = Nothing End Sub
If you have additional questions it may be beneficial to attach a sample file for testing.
Jun 07 2018
12:36 PM
— edited
Jun 07 2018
01:37 PM
Thanks Matt sorted now appreciate the quick response. For anyone attempting a similar code I have also added a few things as wanted email subjectfield to display a variable cell value from an INDEX/lookup and also wanted the body of the email to include a variable cell value (I am too lazy to actually type the information in when the email opens!)
Sub emailsavePDF()Dim objOutlook As Object
Dim objMail As Object
Dim signature As String
Dim oWB As Workbook
Set oWB = ActiveWorkbooks = Range("O7").Value
'
ActiveSheet.ExportAsFixedFormat Type:=xlTypePDF, Filename:= _
s, Quality:=xlQualityStandard, IncludeDocProperties _
:=True, IgnorePrintAreas:=False, OpenAfterPublish:=FalsePDF_File = Range("O7").Value & ".pdf"
Set objOutlook = CreateObject("Outlook.Application")
Set objMail = objOutlook.CreateItem(0)With objMail
.display
End With
signature = objMail.HTMLbody
With objMail
.To = Sheets("Estimate").Range("O9")
.Cc = Sheets("Estimate").Range("O10")
.Subject = Range("N23").Value
.HTMLbody = "<BODY style=font-size:11pt;font-family:Calibri>Hi;<p>Please find attached estimate for trailer " & Range("N24") & "<p> Any questions please don't hesitate to ask." & "<br> <br>" & signature & "</font>"
.Attachments.Add PDF_File
.Save
.display
End WithSet objOutlook = Nothing
Set objMail = Nothing
End Sub
Jun 08 2018
01:25 PM
Hey Mark-
Glad you were able to get it working! Please feel free to post back to the community if you have additional Excel questions.
Jun 27 2018
08:33 AM
— edited
Jun 27 2018
09:06 AM
Formula works great but I need to make an amendment and struggling to get it working correctly.
I want a error message to appear telling the user to check they have met certain conditions should they forget to complete certain cells which constitute the file name the doc is to be saved as.
There are x2 Macros that run together email and save PDF and save Excel doc both below.
I have inserted a goto On Error command but struggling getting it right. It either displays as soon as the Macros are run regardless of error existing or it displays x2 times for each macro and then (even if the cells are left blank) continues to save the excel doc with no file name which I did not think was possible.
I need the message to box to simply appear once and then end/do nothing after the user sees the message rather than save the excel doc with no filename.
Sub emailsavePDF()Dim objOutlook As Object
Dim objMail As Object
Dim signature As String
Dim oWB As Workbook
Set oWB = ActiveWorkbook
On Error GoTo ErrMsg
s = Range("O7").Value
'
ActiveSheet.ExportAsFixedFormat Type:=xlTypePDF, Filename:= _
s, Quality:=xlQualityStandard, IncludeDocProperties _
:=True, IgnorePrintAreas:=False, OpenAfterPublish:=False
PDF_File = Range("O7").Value & ".pdf"
Set objOutlook = CreateObject("Outlook.Application")
Set objMail = objOutlook.CreateItem(0)
With objMail
.display
End With
signature = objMail.HTMLbody
With objMail
.To = Sheets("Estimate").Range("O9")
.Cc = Sheets("Estimate").Range("O10")
.Subject = Range("O12").Value
.HTMLbody = "<BODY style=font-size:11pt;font-family:Calibri>Hi;<p>Please find attached estimate for trailer " & Range("O13") & "<p> Any questions please don't hesitate to ask." & "<br> <br>" & signature & "</font>"
.Attachments.Add PDF_File
.Save
.display
Exit Sub
ErrMsg:
MsgBox "1: A customer must been selected in cell C4" & vbNewLine & "" & vbNewLine & "2: A trailer number must be entered in cell C5 and must not contain any symbols" & vbNewLine & "" & vbNewLine & "3: A breif repair description must be entered in Cell C9 and must not contain any symbols" & vbNewLine & "" & vbNewLine & "4: You are connected to the network", , "THE FOLLOWING STEPS MUST BE COMPLETED"
Exit Sub
End With
Set objOutlook = Nothing
Set objMail = Nothing
End Sub
Sub emailsaveexcel()
Dim newWB As Variant
Dim wb1 As Workbook
Set wb1 = ActiveWorkbook
With wb1
.SaveCopyAs Sheets("Estimate").Range("O5").Text & ".xlsm"
Exit Sub
End With
End Sub
Jun 27 2018
04:17 PM
If your filename only consists of range O7 then you will want to make sure that it is not left blank. If this is the only scenario where you’re getting an error you may not need any complex error handling. YOu could probably get by with something like this:
If Range("O7") = vbNullString Then MsgBox "Please fill in cell O7 which contains the filename. " _ & "It has been left blank", vbCritical, "Error Message" Exit Sub End If
Jun 27 2018
04:19 PM
— edited
Jun 27 2018
11:51 PM
Hi Mark,
I’m glad to see this code has been helpful and is still being developed.
1. you are not using the error function correctly.
2. your PDF_FILE variable does not reference a sheet
3. your s variable does not reference a sheet and is a duplicate of PDF_FILE
Example
Public wb As Workbook
Public sh As Worksheet
Private Sub Test()
On Error GoTo errormessage
Set wb = ActiveWorkbook
Set sh = ActiveSheet
With sh
If .Range(«C4») = Empty Then
MsgBox («please enter values in C4»)
End
ElseIf .Range(«C5») = Empty Then
MsgBox («please enter values in C5»)
End
ElseIf .Range(«C9») = Empty Then
MsgBox («please enter values in C9»)
End
End If
End With
Dim Filename As String
Filename = «C:UsersPublicDocuments» & sh.Range(«O5»).Value & «.xlsm»
wb.SaveAs Filename:=Filename, FileFormat:=xlOpenXMLWorkbookMacroEnabled
errormessage:
With Err
Select Case .Number
Case 1004 ‘add code for 1004 error
MsgBox «Error: » & Err.Number & vbNewLine & vbNewLine & Err.Description, vbOKOnly + vbExclamation, «Error»
Case Else
MsgBox «Error: » & Err.Number & vbNewLine & vbNewLine & Err.Description, vbOKOnly + vbExclamation, «Error»
End Select
Resume Next
End With
EndOfSub:
End Sub
Jun 28 2018
12:23 AM
— edited
Jun 28 2018
01:16 AM
Hi Matt/Andrew
O7 Contains a formula something like =O6&C5&» «&C9 so the filename/directory is the combined contents of cells c5 (Unit number) and c9 (repair description) the directory the file is being saved in (Which varies on a drop down placed in cell C4) must also be selected or the formula will not know where to save it. The directory varies depending on customer the spreadsheet is being used for.
Just the elaborate the Macro works perfectly for me I just need a pop up message if cells 06 C6 and C9 are left blank as this will cause an error as the Macro will not know where to save if C4 is blank (Technically the user only needs to put text in one box for the file name O6 or C5) an error will only occur if both are blank. The only other time an error would occur is if the user is not connected to our network as all of the directory’s are on our network drive. The user does not see or edit Cell O7
I need an error handling code reminding the user to fill in these cells if they forget rather than the standard Excel message which doesn’t really help the user. It may be possible to have a Macro telling the user to fill in these cells but an error will still occur if they are not connected to our network/VPN so error message will still occur.
Code without my attempt at error goto below:
Sub emailsavePDF()
Dim objOutlook As Object
Dim objMail As Object
Dim signature As String
Dim oWB As Workbook
Set oWB = ActiveWorkbook
s = Range(«O7»).Value
‘
ActiveSheet.ExportAsFixedFormat Type:=xlTypePDF, Filename:= _
s, Quality:=xlQualityStandard, IncludeDocProperties _
:=True, IgnorePrintAreas:=False, OpenAfterPublish:=False
PDF_File = Range(«O7»).Value & «.pdf»
Set objOutlook = CreateObject(«Outlook.Application»)
Set objMail = objOutlook.CreateItem(0)
With objMail
.display
End With
signature = objMail.HTMLbody
With objMail
.To = Sheets(«Estimate»).Range(«O9»)
.Cc = Sheets(«Estimate»).Range(«O10»)
.Subject = Range(«O12»).Value
.HTMLbody = «<BODY style=font-size:11pt;font-family:Calibri>Hi;<p>Please find attached estimate for trailer » & Range(«O13») & «<p> Any questions please don’t hesitate to ask.» & «<br> <br>» & signature & «</font>»
.Attachments.Add PDF_File
.Save
.display
End With
Set objOutlook = Nothing
Set objMail = Nothing
End Sub
Sub emailsaveexcel()
Dim newWB As Variant
Dim wb1 As Workbook
Set wb1 = ActiveWorkbook
With wb1
.SaveCopyAs Sheets(«Estimate»).Range(«O5»).Text & «.xlsm»
End With
End Sub
Jun 28 2018
05:19 AM
Mark-
Try something like this:
Sub WhateverSub() On Error GoTo ErrHandler 'Your Code Here.... Exit Sub ErrHandler: 'If one of these cells is blank.... If Range("C4") = vbNullString Or Range("C6") = vbNullString Or Range("C9") = vbNullString Then MsgBox "Please fill in cells O7, C6 and C9 they are mandatory fields that " _ & "have been left blank!!", vbCritical, "Error Message" Exit Sub Else 'Some other error like a VPN error MsgBox "Please make sure you have a network connection!", vbCritical, "Error Message" Exit Sub End If End Sub
You can check to see if a directory exists like this….
If Dir("C:Usersmmickle1Documents", vbDirectory) = "" Then MsgBox "Directory does not exist..." End If
You may consider having some data validation to light up a cell red if the mandatory fields are not filled in as an extra safe guard…
Here’s the original article I learned about error handling with… it’s pretty thorough. In fact I still reference it occasionally :
Chip Pearson Error Handling
Jun 29 2018
12:08 AM
You could use WScript.Network to connect to/ test for mapped drives? (the network)
https://ss64.com/vb/network.html
the error handler is very general and will display a message showing any code problems it catches. its up to you to decide what to do with the errors.
Sub emailsavePDF()on error Goto ErrHandler
Dim objOutlook As Object
Dim objMail As Object
Dim signature As String
Dim oWB As WorkbookSet oWB = ActiveWorkbook
' MsgBox "
'1: A customer must been selected in cell C4" & vbNewLine & "" & vbNewLine & "
'2: A trailer number must be entered in cell C5 and must not contain any symbols" & vbNewLine & "" & vbNewLine & "
'3: A breif repair description must be entered in Cell C9 and must not contain any symbols" & vbNewLine & "" & vbNewLine & "
'4: You are connected to the network", , "THE FOLLOWING STEPS MUST BE COMPLETED"With Sheets("Estimate")
If .Range("C4") = Empty Then
MsgBox ("A customer must been selected in cell C4 and must not contain any symbols")
End
ElseIf .Range("C5") = Empty Then
MsgBox ("A trailer number must be entered in cell C5 and must not contain any symbols")
End
ElseIf .Range("C9") = Empty Then
MsgBox ("A breif repair description must be entered in Cell C9 and must not contain any symbols")
End
End IfEnd With
s = Range("O7").Value 'excel file name
ActiveSheet.ExportAsFixedFormat Type:=xlTypePDF, Filename:=s, Quality:=xlQualityStandard, IncludeDocProperties:=True, IgnorePrintAreas:=False, OpenAfterPublish:=False
PDF_File = s & ".pdf" 'pdf file name
Set objOutlook = CreateObject("Outlook.Application")
Set objMail = objOutlook.CreateItem(0)With objMail
.display
End With
signature = objMail.HTMLbody
With objMail
.To = Sheets("Estimate").Range("O9")
.Cc = Sheets("Estimate").Range("O10")
.Subject = Range("O12").Value
.HTMLbody = "<BODY style=font-size:11pt;font-family:Calibri>Hi;<p>Please find attached estimate for trailer " & Range("O13") & "<p> Any questions please don't hesitate to ask." & "<br> <br>" & signature & "</font>"
.Attachments.Add PDF_File
.Save
.displayEnd With
Set objOutlook = Nothing
Set objMail = Nothing
ErrHandler:
With ErrSelect Case .Number
Case 1004 'add code for 1004 error
MsgBox "Error: " & Err.Number & vbNewLine & vbNewLine & Err.Description, vbOKOnly + vbExclamation, "Error"
Case Else
MsgBox "Error: " & Err.Number & vbNewLine & vbNewLine & Err.Description, vbOKOnly + vbExclamation, "Error"
End Select
Resume Next
End With
End SubSub emailsaveexcel()
Dim newWB As Variant
Dim wb1 As Workbook
Set wb1 = ActiveWorkbook
With wb1
.SaveCopyAs Sheets("Estimate").Range("O5").Text & ".xlsm"
End With
End Sub
Jul 11 2018
11:12 AM
Thanks Andrew
The below works great thanks with one exception. Is there any way if the user enters any of the following / : * ? ” < > |anywhere in both cell C5 or C9 a warning appears as these cells form the filename and the file cannot be saved if these symbols are contained anywhere within the text of these cells
Sub emailsavePDF()Dim objOutlook As Object
Dim objMail As Object
Dim signature As String
Dim oWB As Workbook
Set oWB = ActiveWorkbook' MsgBox "
'1: A customer must been selected in cell C4" & vbNewLine & "" & vbNewLine & "
'2: A trailer number must be entered in cell C5 and must not contain any symbols" & vbNewLine & "" & vbNewLine & "
'3: A breif repair description must be entered in Cell C9 and must not contain any symbols" & vbNewLine & "" & vbNewLine & "
'4: You are connected to the network", , "THE FOLLOWING STEPS MUST BE COMPLETED"With Sheets("Estimate")
If .Range("C4") = Empty Then
MsgBox ("A customer must been selected in cell C4 and must not contain any symbols")
End
ElseIf .Range("C5") = Empty Then
MsgBox ("A trailer number must be entered in cell C5 and must not contain any symbols")
End
ElseIf .Range("C9") = Empty Then
MsgBox ("A breif repair description must be entered in Cell C9 and must not contain any symbols")
End
End IfEnd With
s = Range("O7").Value
'
ActiveSheet.ExportAsFixedFormat Type:=xlTypePDF, Filename:= _
s, Quality:=xlQualityStandard, IncludeDocProperties _
:=True, IgnorePrintAreas:=False, OpenAfterPublish:=FalsePDF_File = Range("O7").Value & ".pdf"
Set objOutlook = CreateObject("Outlook.Application")
Set objMail = objOutlook.CreateItem(0)With objMail
.display
End With
signature = objMail.HTMLbody
With objMail
.To = Sheets("Estimate").Range("O9")
.Cc = Sheets("Estimate").Range("O10")
.Subject = Range("O12").Value
.HTMLbody = "<BODY style=font-size:11pt;font-family:Calibri>Hi;<p>Please find attached estimate for trailer " & Range("O13") & "<p> Any questions please don't hesitate to ask." & "<br> <br>" & signature & "</font>"
.Attachments.Add PDF_File
.Save
.display
End WithSet objOutlook = Nothing
Set objMail = NothingEnd Sub
Sub emailsaveexcel()
Dim newWB As Variant
Dim wb1 As Workbook
Set wb1 = ActiveWorkbook
With wb1
.SaveCopyAs Sheets("Estimate").Range("O5").Text & ".xlsm"
End WithEnd Sub
ggg
Jul 11 2018
11:43 AM
Mark-
You can just error handle for that scenario by using the Replace() Function.
s = Replace(Replace(Replace(Range("O7"), "/", "_"), "", "_"), ":", "_")
This will replace those special characters with an «_» which is acceptable for file names…
макрос сохранения файла с другим расширением (pdf) |
||||||||
Ответить |
||||||||
Ответить |
||||||||
Ответить |
||||||||
Ответить |
||||||||
Ответить |
||||||||
Ответить |
||||||||
Ответить |
||||||||
Ответить |
||||||||
Ответить |
||||||||
Ответить |
In this article, you will learn how to create a PDF to Excel converter using Excel VBA. We have created a Setting worksheet wherein we are taking 2 folders inputs on range E11 and E12 respectively.
- PDF Files Folder: Wherein all the PDF files are available
- Excel File Folder: Wherein Excel files will be saved after conversion.
Add the Microsoft Scripting Runtime Reference in your VBA project then paste below given code in your module.
Below is the code to convert PDF to excel.
Option Explicit Sub PDF_To_Excel() Dim setting_sh As Worksheet Set setting_sh = ThisWorkbook.Sheets("Setting") Dim pdf_path As String Dim excel_path As String pdf_path = setting_sh.Range("E11").Value excel_path = setting_sh.Range("E12").Value Dim fso As New FileSystemObject Dim fo As Folder Dim f As File Set fo = fso.GetFolder(pdf_path) Dim wa As Object Dim doc As Object Dim wr As Object Set wa = CreateObject("word.application") 'Dim wa As New Word.Application wa.Visible = True 'Dim doc As Word.Document Dim nwb As Workbook Dim nsh As Worksheet 'Dim wr As Word.Range For Each f In fo.Files Set doc = wa.documents.Open(f.Path, False, Format:="PDF Files") Set wr = doc.Paragraphs(1).Range wr.WholeStory Set nwb = Workbooks.Add Set nsh = nwb.Sheets(1) wr.Copy nsh.Paste nwb.SaveAs (excel_path & "" & Replace(f.Name, ".pdf", ".xlsx")) doc.Close False nwb.Close False Next wa.Quit MsgBox "Done" End Sub
Click here to download the Practice file-
Watch the step by step video tutorial:
Meet PK, the founder of PK-AnExcelExpert.com! With over 15 years of experience in Data Visualization, Excel Automation, and dashboard creation. PK is a Microsoft Certified Professional who has a passion for all things in Excel. PK loves to explore new and innovative ways to use Excel and is always eager to share his knowledge with others. With an eye for detail and a commitment to excellence, PK has become a go-to expert in the world of Excel. Whether you’re looking to create stunning visualizations or streamline your workflow with automation, PK has the skills and expertise to help you succeed. Join the many satisfied clients who have benefited from PK’s services and see how he can take your Excel skills to the next level!
Добрый день уважаемые Форумчане!
Получилось частично сделать макрос, который выполняет сохранение нужного листа в данном случае(«Лист2») в PDF формате, при этом присваивая ему имя из заданной ячейки.
Но имеются 2 вопроса:
1. Почему-то если в ячейки стоит формула, а она мне нужна, по сколько я соединяю ей тек. дату и текст, макрос не вставляет ее значение в название файла автоматом, а если написать просто текст, то вставляет. Собственно в чём тут проблема пожалуйста подскажите.
2. Как указать нужную папку для сохранения(например C:UsersВладелецDesktopЕжедневный), если я правильно понял, то к ExportAsFixedFormat надо задать еще переменную OutputFileName, но правильных примеров не нашёл.
Спасибо.
[Some of the links in this Excel Tutorial are affiliate links, which means that if you choose to make a purchase, I will earn a commission. This commission comes at no additional cost to you.]
I’ve written about the topic of PDF and Excel several times throughout Power Spreadsheets. Some of the issues I’ve covered include the following:
- How to convert PDF files to Excel.
- How to save Excel files as PDF using VBA.
All of these topics are quite important. After all, PDF files are one of the most widely used formats for electronic documents.
If you’re anything like me, you probably encounter PDF files at work virtually every single day. In some of those cases, you may need to extract data from PDF to Excel. I cover the topic of manually converting PDF files to Excel in the first post I link to above. In that particular post I explain 3 different methods you can use:
- Method #1: Copying and pasting directly from the PDF file.
- Method #2: Using Microsoft Word.
- Method #3: Using a PDF Converter, such as Able2Extract.
Those 3 methods (particularly methods #2 and #3) are helpful in many situations you may encounter in your day to day work. You might, however, be wondering whether there’s a way to automate such processes.
If that’s the case, I’ve got good news for you:
It’s possible to use VBA for purposes of extracting data from PDF files to Excel. To be more precise, you can use VBA to automate the 3 methods of converting PDF to Excel that I mention above. I show you how to do this in the VBA tutorial below.
The following table of contents provides a more detailed list of the main topics I cover in this blog post:
All of the methods to extract data from PDF to Excel using VBA that I explain in this tutorial require that you use applications different from Excel (Adobe Reader, Able2Extract or Word). Therefore, let’s start by taking a quick look at the topic of…
Controlling Other Application From Excel With VBA
There are several ways in which you can control other applications from Excel. For example, in Mastering VBA for Microsoft Office 2016, author Richard Mansfield lists the following:
- #1: Automation.
- #2: Data objects.
- #3: Dynamic Data Exchange (DDE).
- #4: SendKeys.
- #5: The Command Line.
For purposes of this VBA tutorial, I only cover the following concepts related to this topic:
- SendKeys (#4 above).
More precisely, I use SendKeys when working with Adobe Reader or Able2Extract to convert a file from PDF to Excel.
- Automation (#1 above).
Automation is relevant for purposes of using Word when converting a PDF file to Excel.
I explain how to use SendKeys for purposes of communicating between applications first. However, for purposes of using SendKeys, you first need to use and understand the Shell Function. The reason for this is that you can only use SendKeys with an application that is currently open and running. SendKeys isn’t capable of opening an application by itself.
However, the Shell Function is able to run an application. Once the application is running, you can use SendKeys. I explain all of this in the following sections, starting with the…
Shell Function
You can use the Shell Function to achieve 2 things:
- #1: Run an executable program.
- #2: If the program runs successfully, obtain a Variant(Double) value that represents the task ID of the relevant program.
If the program doesn’t run successfully, an error occurs.
The task ID is “a unique number that identifies the running program”. You may find this ID useful for other purposes, such as using the TaskKill command to close the program or, as explained in Excel 2016 Power Programming with VBA, “activate the task”.
The basic syntax of Shell is as follows:
Shell(pathname[,windowstyle])
As you can see, the Shell Function uses the following 2 arguments:
- pathname.
pathname is a required Variant (String) argument. You use it to specify the following:
#1: The path and name of the program to be executed.
#2: Any arguments or command-line switches that may be required.
In other words, you generally use pathname to specify (i) the program to be executed and, if required, (ii) the file name and arguments required for purposes of executing the program.
If you’re working on a Mac, you can use the MacID Function for purposes of using the application signature (instead of its name) to specify the pathname argument.
- windowstyle.
windowstyle is an optional Variant (Integer) argument.
You use windowstyle for purposes of determining the style of the window in which the program you specify (using pathname) is executed. You can generally use the windowstyle values that appear in the table below. In all cases, I describe (i) what happens to the window of the newly executed program and (ii) which window has the focus (is active):
Value Constant Description 0 vbHide Program window is hidden. Focus is on hidden window.
1 vbNormalFocus New window is displayed on most recent size and position. Focus is on new window.
2 vbMinimizedFocus New window is displayed as an icon. Focus is on new window (icon).
3 vbMaximizedFocus New window is maximized. Focus is on new window.
4 vbNormalNoFocus New window is displayed in most recent size and position. Focus is on current active window .
6 vbMinimizedNoFocus New window is displayed as an icon. Focus is on current active window.
If you omit windowstyle, the default value is vbMinimizedFocus. In other words, the program starts minimized with focus.
If you’re working on a Mac, the following differences apply (particularly on the more recent iOS versions):
Difference #1: You can only use windowstyle to specify whether or not the relevant application has focus when executed.
Difference #2: The vbHide windowstyle constant (0) I show above isn’t generally applicable.
Therefore, in practice, when working with a Mac you can usually only choose between the following 2 options:
Option #1: Display the newly executed application in the foreground (with focus). For these purposes, you can use any of the 3 windowstyle values with focus (vbNormalFocus, vbMinimizedFocus or vbMaximizedFocus).
Option #2: Open the application in the background (without focus). For these purposes, you can generally use the windowstyle values without focus (vbNormalNoFocus, vbMinimizedNoFocus or vbHide).
The Shell Function has a characteristic that can be tricky to handle:
It runs the relevant program asynchronously. In other words, Visual Basic for Applications continues executing the statements that follow the Shell Function (without pausing) regardless of whether Shell has achieved its purpose (opening another application).
As explained by Richard Mansfield in Mastering VBA for Microsoft Office 2016:
When VBA executes a Shell statement, it registers the statement as an action to be performed— but that action may not necessarily be finished before the next statement in your VBA code executes.
Here’s why this can cause you problems:
- First, as I explain above, whenever you execute a particular program using the Shell Function, is possible that the newly executed program doesn’t execute prior to Visual Basic for Applications executing the statements that follow the Shell Function.
- Despite the above, the statements that follow Shell may rely on the Shell statement having been fully executed. In other words, those statements may need the application launched by Shell to be open before they can do anything.
- If (i) the statements that follow the Shell Function depend on Shell being executed but (ii) Shell hasn’t actually executed the relevant program at that point, an error occurs.
You don’t have to go too far to find situations where this is an issue. In fact, the VBA Sub procedures that I provide below serve as examples.
This, however, doesn’t mean that you can’t do anything to reduce the possibility of errors. As Richard Mansfield goes on to explain in the above-cited Mastering VBA for Microsoft Office 2016:
A crude but often-effective fix is to just allow extra time for the Shell function to execute before taking any dependent action.
There are a few ways in which you can allow for this extra time. For example:
- Mansfield himself suggests calling the Sleep API.
You can use the Sleep API to suspend execution of a thread for a certain amount of time.
- In the code examples below I use the Application.Wait method.
The Wait method allows you to pause a macro for a specified amount of time.
- You can also use a loop.
I don’t cover these topics in detail within this blog post. However, I may write about the topic of pausing or delaying VBA in more detail sometime in the future. If you want to receive an email whenever I publish new content in Power Spreadsheets, please make sure to register for our Newsletter by entering your email address below now:
Before we move on to SendKeys, let’s take a short look at the…
TaskKill Command
The main purpose of the TaskKill command is to end a task or process. You can use the Shell Function for purposes of killing the application once you’re done working with it.
Therefore, in the context of this VBA tutorial, we’re interested in how we can use TaskKill for purposes of closing the application that the Shell Function runs. As a consequence of this, I don’t explain everything there’s to know about the TaskKill command.
The syntax of the TaskKill command can get relatively complicated. However, within the context of this blog post, you just need to understand the following simplified syntax version:
TaskKill /F /Task
Within this simplified syntax version, “F” specifies that the relevant task should be forcefully terminated.
“Task” is where you specify what is the task or process to be terminated. There are 2 basic ways in which you can specify this:
- PID ProcessID.
This makes reference to the Process ID of the task or process you want to terminate.
If you use this alternative within the context of this tutorial, the relevant Process ID is the task ID returned by the Shell Function (as I explain above).
- IM ImageName.
In this alternative, you specify the process to be terminated by using its image (process) name. You generally have to use “.exe” when using this alternative.
This is the way in which I specify the task to be terminated in the code examples below. You can refer to them for purposes of seeing how I implement the TaskKill command to close both Adobe Reader and Able2Extract after using them to convert the data from a PDF file to Excel.
SendKeys Statement And Application.SendKeys Method
Visual Basic for Applications has both a SendKeys statement and a SendKeys method. They both behave in a very similar manner and, therefore, I explain both of them in this section.
Let’s start by taking a general look at the main similarities and differences between the statement and the method:
SendKeys Statement Vs. Application.SendKeys Method
As a general matter, you can use both the SendKeys statement and the Application.SendKeys method to send keystrokes to a particular window or application. This, in practice, has the same result as actually typing those statements on the keyboard.
As a general rule, you can’t use SendKeys to send keystrokes to an application that isn’t designed to be executed on a Windows or Mac platform. This is the case, for example, of DOS applications that run in a virtual machine.
SendKeys is also not able to work with applications that aren’t currently running. As I mention above, you must use the Shell Function for purposes of opening the application. Once the application is running, you can use SendKeys.
On a more specific level, according to the Microsoft Dev Center, you can:
- Use the SendKeys statement for purposes of sending keystrokes to the active window.
- Use the Application.SendKeys method for purposes of sending keystrokes to the active application.
Even though the description is almost identical, notice the difference at the end of the description of both constructs. Even though the active window and the active application may (and usually) make reference to the same, this isn’t always the case.
The active window and active application may be different.
So, what’s the bottom line?
- The SendKeys statement and the Application.SendKeys method are interchangeable and work the same in most cases.
- They’re, however, not redundant.
More precisely, in order to determine which one is appropriate to use, you must be aware whether you’re (i) controlling one application from another, or (ii) working within one application.
Let’s take a look at the …
SendKeys Syntax
The basic syntax of the SendKeys statement is as follows:
SendKeys string [, wait]
The syntax of the Application.SendKeys method mirrors the above:
expression.SendKeys(Keys, Wait)
When working with the Application.SendKeys method, “expression” is a variable representing an Application object. Other than this, SendKeys has 2 arguments:
- Argument #1: string (for the SendKeys statement) or Keys (for the SendKeys method).
This first argument (string or Keys) is required. You use it to specify the keystrokes that should be sent as a string.
- Argument #2: wait.
wait is an optional argument. You use it to determine whether Excel waits for the relevant keystrokes to be processed before control comes back to the macro. More precisely:
If you set wait to True, the keystrokes you specify with argument #1 (string or Keys) are processed. Only after this processing has been completed, control returns to the VBA procedure.
If you omit the wait argument (or set to its default value of False), the keystrokes are sent and control immediately returns to the macro. In this case, the VBA code doesn’t wait for the keystrokes to be processed and the macro execution simply continues.
Argument #2 is relatively straightforward. As I mention above, it’s a Boolean and you can only set it to True or False (or omit it).
However, you have a lot of options when specifying argument #1 (string or Keys). Therefore, let’s take a look at the 6 main rules you should consider when specifying the keystrokes you want the VBA application to send:
- Rule #1: As a general rule, you specify letters, numbers or other characters that are displayed upon you pressing the appropriate key, by including the appropriate character in your code.
For example, you specify the word Excel by including the string “Excel” in your VBA code.
The main exceptions you should be aware of are those I explain in rule #4 below.
- Rule #2: As a general rule, you specify keys that (i) don’t result in a character being displayed when you press the appropriate key or (ii) represent an action (instead of an actual character) by using the appropriate code. The table below includes several of the most relevant. You can also refer to the lists at the Microsoft Dev Center here (for the SendKeys statement) and here (for the SendKeys method).
Key SendKeys Code BACKSPACE {BACKSPACE}, {BS} or {BKSP} BREAK {BREAK} CAPS LOCK {CAPSLOCK} DELETE or DEL {DELETE} or {DEL} DOWN ARROW {DOWN} END {END} ENTER {ENTER} (numeric keypard) or ~ (tilde) ESC {ESCAPE} or {ESC} HELP {HELP} HOME {HOME} INSERT or INS {INSERT} or {INS} LEFT ARROW {LEFT} NUM LOCK {NUMLOCK} PAGE DOWN {PGDN} PAGE UP {PGUP} RETURN {RETURN} RIGHT ARROW {RIGHT} SCROLL LOCK {SCROLLLOCK} TAB {TAB} UP ARROW {UP} Function Keys (F1 through F16) {F#}, where # is the relevant number between 1 (for F1) through 16 (for F16) - Rule #3: When specifying that a particular keystroke must be repeated a certain number of times, you can generally use a shortcut. Instead of repeating the relevant keystroke several times, use the following syntax:
{key number}
Where (i) “key” represents the key you want to enter and (ii) “number” is the number of times the relevant key should be repeated.
- Rule #4: Despite rule #1 above, certain keys that have a special meaning should be surrounded by curly braces ({ }). The most relevant keys falling within this rule #4 are (i) plus (+), (ii) caret (^), (iii) parentheses (( )), (iv) tilde (~), (v) brackets ([ ]), and (vi) curly braces themselves ({ }).
Strictly speaking, square brackets ([ ]) don’t actually have a special meaning for SendKeys itself. The main reason why they’re subject to this rule is that they have a special meaning within other applications. This may be significant if there’s dynamic data exchange (DDE).
- Rule #5: If you want to specify a combination of (i) a particular key and (ii) Shift, Ctrl or Alt, apply the following codes:
Key SendKeys Code Shift + Ctrl ^ Alt % As I show in the examples further below, these codes are helpful for purposes of specifying keyboard shortcuts.
Note that this rule #5 applies only to combinations of a single key with Shift, Ctrl or Alt. In other words, only the keystroke that immediately follows the code above (for Shift, Ctrl or Alt) is considered to be pressed at the same time. The other keystrokes in the string that you specify are considered to be pressed separately.
If you need to hold Shift, Ctrl or Alt while more than 1 key is pressed, take a look at…
- Rule #6: Further to rule #5, you can specify that Shift, Ctrl or Alt should be held down while other keys are pressed. For these purposes, enclose those other keys in parentheses (( )). In other words, use the following syntax:
Code(Keys)
For these purposes, (i) “Code” is the code of the key you want to hold down, as they appear in the table under rule #5 above, and (ii) “Keys” are the other keys to be pressed while the one specified in the preceding (i) is held down.
When To Use SendKeys
SendKeys is a very basic way of controlling other applications from Excel. For example, in Mastering VBA for Microsoft Office 2016, author Richard Mansfield describes SendKeys as…
The oldest and most primitive method of communicating between applications.
In other sections, Mansfield uses adjectives such as “basic” and “limited” to further describe SendKeys.
You may not be surprised by such descriptions.
In fact, as Mansfield himself explains, SendKeys may cause certain issues. He highlights the following 2:
- Problem #1: Similar to what occurs with the Shell Function I explain above, SendKeys can often be the cause of timing issues.
- Problem #2: Testing code is more complicated and less reliable. In particular, you generally face limitations when trying to step through the code using the Visual Basic Editor.
You might then be wondering…
Should you use SendKeys at all? If so, when should you use SendKeys?
SendKeys should probably not be your VBA construct of choice for purposes of controlling other applications from Excel. However, in certain situations, it can be helpful. A useful general rule is provided in Mastering VBA for Microsoft Office 2016:
You may find SendKeys useful if neither Automation nor DDE works with the target application.
In the code examples that I explain below, SendKeys is used for purposes of (i) accessing the menus of other applications and the tools within those menus, or (ii) taking advantage of keyboard shortcuts within those applications.
VBA Code Example #1: PDF To Excel Using Adobe Reader
All of the macro examples within this tutorial extract the data within the Statistical Tables that accompany Christopher Dougherty’s book Introduction to Econometrics. You can get this PDF file here.
Let’s start with the first example…
The following sample macro (pdf_To_Excel_Adobe) uses Adobe Reader for purposes of converting the contents of the PDF file to Excel.
The basic structure of this VBA application is commonly found throughout the Internet whenever you search for examples of how to extract data from PDF to Excel.
This macro proceeds roughly as follows:
- Step #1: Opens the relevant PDF file using Adobe Reader.
- Step #2: Selects and copies the data.
- Step #3: Pastes the data into Excel.
Let’s take a look at each of the lines of code to understand the specifics behind this sample macro:
Line #1: Dim myWorksheet As Worksheet
Uses the Dim statement to declare the myWorksheet object variable as a Worksheet.
The main purpose of this line of code is to create an object variable that represents the Excel worksheet where the contents of the PDF file are pasted.
Lines #2 To #4: Dim adobeReaderPath As String | Dim pathAndFileName As String | Dim shellPathName As String
All of these lines use the Dim statement for purposes of declaring several variables (adobeReaderPath, pathAndFileName and shellPathName) as Strings.
The main purpose of these lines of code is to declare the variables that hold the 3 following strings:
- String #1: The path to the application used to open the PDF file.
In the case of this example #1, this application is Adobe Reader. The variable holding this string is adobeReaderPath.
- String #2: The path to the PDF file you want to convert to Excel.
All of the examples I use throughout this tutorial convert the file named “Statistical Tables”, as I explain above. This string is held by the pathAndFileName variable.
- String #3: The pathname argument used by the Shell Function.
For purposes of the VBA code examples within this tutorial, the general form of the pathname argument is as follows:
adobeReaderPath “pathAndFileName”
The resulting string is held by the shellPathName variable.
Line #5: Set myWorksheet = ActiveWorkbook.Worksheets(“Adobe Reader”)
Uses the Set statement for purposes of assigning the worksheet named “Adobe Reader” within the active workbook to the myWorksheet object variable.
The structure of this statement is as follows:
- Item #1: myWorksheet is originally declared in line #1 above.
- Item #2: The equal sign (=) acts as an assignment operator. It (i) takes the expression to its right side (item #3 below), and (ii) assigns it to the variable on its left side (item #1 above).
- Item #3: The reference to the relevant worksheet (Adobe Reader) is created by using the following VBA constructs:
Property #1: The Application.ActiveWorkbook property returns the active workbook.
Property #2: The Workbook.Worksheets property to return the worksheet named “Adobe Reader”.
Lines #6 And #7: adobeReaderPath = “C:Program FilesAdobeAcrobat Reader DCReaderAcroRd32.exe” | pathAndFileName = “C:…Power Spreadsheets ExamplesStatistical Tables.pdf”
Both lines of code use the assignment operator (=) to (i) assign the string on the right side of the operator to (ii) the variable on the left side.
In practical terms:
- Line #6 assigns the path of the application you use to open the PDF file (Adobe Reader in this case) to the adobeReaderPath variable.
adobeReaderPath is declared in line #2 above.
- Line #7 assigns the path of the PDF file you want to convert to Excel (Statistical Tables) to the pathAndFileName variable.
pathAndFileName is declared in line #3 above.
Line #8: shellPathName = adobeReaderPath & ” “”” & pathAndFileName & “”””
Just as the previous lines #6 and #7, uses the assignment operator to assign a string to a variable.
In this particular case, however:
- The string being assigned is built by using the concatenation operator (&) to concatenate 4 separate items:
Item #1: The string held by the adobeReaderPath variable. This is the path to the application used to open the PDF file.
Items #2 And #4: Double quotes (“).
Item #3: The string held by the pathAndFileName variable. This is the path to the PDF file you convert to Excel.
- The variable to which the string is assigned is shellPathName. This variable is declared in line #4 above.
Lines #9 To #11: Call Shell(pathname:=shellPathName, windowstyle:=vbNormalFocus)
Uses the Shell Function to run the program you use to open the PDF file you’re converting.
In other words, this is the line of code that opens the program you’re using. In this code example #1, this program is Adobe Reader.
The way in which the program is opened is further determined by the following 2 arguments:
- Argument #1: pathname:=shellPathName.
As I explain above, you use the pathname to specify the program to be executed and any relevant arguments or command-line switches. In this code example, pathname is set to be the string held by the shellPathName variable.
The value held by the shellPathName variable is as determined in line #8 above. Therefore, shellPathName holds a string containing the paths and names of (i) the executed program (Adobe Reader in example #1) and (ii) the PDF file to be converted to Excel (Statistical Tables).
- Argument #2: windowstyle:=vbNormalFocus.
You use the windowstyle argument for purposes of specifying the style of the window in which the relevant program is executed. In this case, I use the vbNormalFocus constant.
The consequence of this is that the program (Adobe Reader in this case), opens in a window that has (i) focus and (ii) its original size and position.
Lines #12 And #16: Application.Wait Now + TimeValue(“0:00:03”) | Application.Wait Now + TimeValue(“0:00:30”)
As I explain when introducing the Shell Function and SendKeys, one of the main weaknesses of these constructs is that they can be the source of timing problems.
I also mention how there are several ways in which you can handle these timing problems. Since this VBA tutorial focuses on how to convert files from PDF to Excel (instead of pausing or delaying VBA), I use a relatively simple construct for these purposes: The Application.Wait method.
Both of these lines of code (#12 and #16) use the Application.Wait method for purposes of pausing the execution of the sample macro until a particular time.
Application.Wait has a single required argument: Time. This is the time at which you want execution of the macro to resume.
In the case we’re looking at, Time is specified by using the following VBA constructs:
- Item #1: The Now Function, which returns the current date and time.
- Item #2: The TimeValue Function. TimeValue takes its string argument (time), and returns a Variant (Date) that contains the time.
When used together (as in this case), the Now and TimeValue Functions return a time that is a certain amount of time in the future. In the cases above:
- “Now + TimeValue(“0:00:03″)” returns a time 3 seconds into the future. In other words, execution of the macro is paused for 3 seconds.
- “Now + TimeValue(“0:00:30″)” returns a time 30 seconds from now. The consequence of this is that execution of the macro pauses for 30 seconds.
The times I’m using here are for illustrative purposes only. The computers you work with may be faster or slower than the ones I’ve used while testing the sample macro.
Inaccuracies in the amount of time you use as argument for the Application.Wait method can result in errors. For example, if I reduce TimeValue in line #16 from “0:00:30” (30 seconds) to “0:00:01” (1 second), Visual Basic for Applications displays the following dialog box:
Line #13: SendKeys “%vpc”
Uses SendKeys to enter the keyboard shortcut “Alt + V, P, C”.
If you’re working with Adobe Reader, this keyboard shortcut proceeds as follows:
- Step #1: Press “Alt + V”, which expands the View menu.
- Step #2: Expand the Page Display options.
- Step #3: Select Enable Scrolling.
You might be wondering why should you ensure that scrolling is enabled if the only thing you want to do is select all the text.
Let me explain:
The Select All command of Adobe Reader (which I use in the following line #14) doesn’t always select all the text in the open file. More particularly, if Single Page View is enabled, Select All usually only selects the contents of a single page.
Therefore, in order to ensure that the Select All command given in line #14 selects all the text within the PDF file (vs. a single page), I include this line #13 to guarantee that scrolling is enabled.
Note that the “Alt + V, P, C” keyboard shortcut used by this line #13 means that the Alt and V keys are pressed at the same time. You can alternatively send the keystroke “Alt, V, P, C”, where the keystrokes are sent sequentially. If you prefer using this second version of the keyboard shortcut, simply split apart the keystrokes sent by SendKeys using the following statements instead of line #13:
SendKeys “%”
SendKeys “vpc”
Line #14: SendKeys “^a”
Uses SendKeys for purposes of entering the keyboard shortcut “Ctrl + A”. Within Adobe Reader, this is the keyboard shortcut to select all.
Line #15: SendKeys “^c”
Uses SendKeys to enter the keyboard shortcut “Ctrl + C”. In Adobe Reader, this is the keyboard shortcut to copy.
Lines #17 And #20: With myWorksheet | End With
These are the opening and closing statement of a With… End With block.
The consequence of using the With… End With statement is that the series of statements within the block (lines #18 and #19 below) are executed on the object specified in the opening line of the block.
In the example we’re looking at, this means that both lines #18 and #19 below work with the object represented by myWorksheet. In this case, this object is the worksheet named “Adobe Reader” within the active workbook.
Line #18: .Range(“B4”).Select
Uses the Worksheet.Range property (myWorksheet.Range) and the Range.Select method (Range(“B4”).Select) for purposes of selecting cell B4 of the worksheet represented by the myWorksheet variable.
As a consequence of the assignment made in line #5 above, the worksheet where cell B4 is selected is that named “Adobe Reader”.
The range you select in this line of code (or its equivalent) determines where the contents of the PDF file are pasted in Excel.
Line #19: .PasteSpecial Format:=”Text”
Uses the Worksheet.PasteSpecial method (myWorksheet.PasteSpecial) to paste the contents of the Clipboard on the active worksheet as Text (Format:=”Text”).
You can generally use this method to paste data from other applications. Note that the Worksheet.PasteSpecial method (which we’re using in this example) is different from the Range.PasteSpecial method that I explain here.
The syntax of the Worksheet.PasteSpecial method is as follows:
expression.PasteSpecial(Format, Link, DisplayAsIcon, IconFileName, IconIndex, IconLabel, NoHTMLFormatting)
For purposes of this VBA tutorial, is enough if you’re aware of the following:
- “expression” is a variable representing a Worksheet object. This is myWorksheet in this case.
- Format is an optional parameter that allows you to specify “the Clipboard format of the data”. You specify Format as a string.
In the example we’re looking at, Format is set to “Text”.
This statement, is roughly the equivalent of (i) launching the Paste Special dialog box (by, for example, pressing the “Ctrl + Alt + V” keyboard shortcut), (ii) selecting Text, and (iii) pressing the OK button.
Note that none of the parameters of the Worksheet.PasteSpecial method allows you to select a destination for pasting. Since there’s no such parameter, Worksheet.PasteSpecial pastes the Clipboard contents in the current selection.
As a consequence of this, you must select the destination range prior to calling the PasteSpecial method. In the sample pdf_To_Excel_Adobe macro, this is selection is made by the statement in line #18 above.
Line #21: Call Shell(“TaskKill /F /IM AcroRd32.exe”, vbHide)
Calls the Shell Function to invoke the TaskKill command. The TaskKill command, in turn, kills Adobe Reader.
The windowstyle parameter of the Shell Function is set to vbHide. As a consequence of this, you only see the relevant application (Adobe Reader in this case) closing.
Now that we’ve covered all of the VBA statements used by the sample pdf_To_Excel_Adobe macro, let’s take a look at the practical results obtained when executing the Sub procedure:
PDF To Excel Using Adobe Reader: Example Results
The following image shows the results I obtain when executing the sample pdf_To_Excel_Adobe VBA Sub procedure.
This VBA Convert PDF to Excel Tutorial is accompanied by Excel workbooks containing the data and macros I use. You can get immediate free access to this example workbook by subscribing to the Power Spreadsheets Newsletter.
In most situations, this isn’t the result that you want. However, if you’re relying on Adobe Reader for purposes of converting a PDF file to Excel, you’re unlikely to obtain better results.
The reason for this is that, as a general rule, whenever you copy data from Acrobat Reader and paste it into Excel (which is what this sample macro does), the data ends up in a single column. This is precisely what happens in the image above.
The following sample macro solves this problem by using a PDF Converter: Able2Extract. Let’s take a look at how you can do this:
VBA Code Example #2: PDF To Excel Using Able2Extract
The sample macro (pdf_To_Excel_Able2Extract) that appears further below follows roughly the same as the previous one (pdf_To_Excel_Adobe) with one major difference:
It relies on Able2Extract instead of Adobe Reader for purposes of opening the PDF file and copying the data. In other words, the VBA code that appears below proceeds as follows:
- Step #1: Opens the PDF file you want to convert using Able2Extract.
- Step #2: Selects and copies all the data in the file.
- Step #3: Pastes the data into Excel.
You can try Able2Extract for free here.
The VBA code of this sample pdf_To_Excel_Able2Extract macro is almost the same as that of pdf_To_Excel_Adobe. The following image highlights the differences between both pieces of code:
The differences between both macros can be reduced to the 2 that I explain in the following sections. Other than the clarifications provided in these sections, the explanations I provide for each of the VBA statements in the pdf_To_Excel_Adobe sample macro above generally (also) apply to the statements within the pdf_To_Excel_Able2Extract Sub procedure.
Let’s take a look at the differences between both macros:
Difference #1: References To The Relevant Application
The pdf_To_Excel_Adobe macro uses Adobe Reader for purposes of (i) opening the PDF file you want to convert, and (ii) copying all of its data. The pdf_To_Excel_Able2Extract macro relies on Able2Extract (instead of Adobe Reader) for those same purposes.
As a consequence of the above, the VBA code behind both of these sample macros makes reference to different applications. pdf_To_Excel_Adobe refers to Adobe Reader. pdf_To_Excel_Able2Extract refers to Able2Extract.
This difference is reflected in 3 different places within the VBA code of the sample macros. Only 1 of these changes (#3) has a material effect on the way the macro proceeds.
- #1: The name of the variable that holds the path and name of the relevant application (Adobe Reader or Able2Extract) changes from adobeReaderPath to able2ExtractPath.
This change isn’t really material and I’ve done it for clarity purposes only. In fact, I could’ve used the same variable name for both macros.
Generally speaking, what’s more relevant is that you ensure compliance with the VBA variable naming rules.
- #2: The name of the destination worksheet changes from “Adobe Reader” to “Able2Extract”.
Again, the change isn’t really material. I’ve used different names because the sample workbook that accompanies this tutorial displays the results obtained when executing each macro in separate worksheets. There’s a worksheet named “Adobe Reader”, where the results obtained with the pdf_To_Excel_Adobe macro appear. And there’s a worksheet named “Able2Extract”, where I show the results obtained when running this pdf_To_Excel_Able2Extract macro.
In any case, the object you assign to the myWorksheet variable will, in the end, depend on the particular context you’re in and how you want to design your VBA Sub procedure. Therefore, you’re likely to modify this part of the code anyway.
- #3: The path and name of the application you’re using to open the PDF file and copy its contents changes. The pdf_To_Excel_Adobe macro makes reference to the path and name of Acrobat Reader. The code of the current example (pdf_To_Excel_Able2Extract) refers to the path and name of Able2Extract.
As I anticipate previously, this is the only of the 3 differences we’re looking at here that has a material effect.
In other words, you have flexibility when naming your variables and selecting the destination where you paste the data. However, you must ensure that you’re making reference to the correct application when determining and structuring the pathname parameter of the Shell Function and the process name killed by the TaskKill command (if required).
Before we move on to the second difference between the code of the macro examples, note the following:
As I explain when introducing TaskKill, you have 2 ways of specifying the task to be terminated:
- #1: Using the Task ID returned by the Shell Function.
- #2: Using the name of the process to be terminated.
The code examples in this blog post use alternative #2 (process name). If the sample VBA code relied on the Task ID to identify the task to be terminated (alternative #1), the last line of code would be the same in both sample macros (pdf_To_Excel_Adobe and pdf_To_Excel_Able2Extract).
Difference #2: Keyboard Shortcuts Used
The second group of differences between the sample VBA code for the macros pdf_To_Excel_Adobe and pdf_To_Excel_Able2Extract is related to the use of SendKeys.
More precisely, the keyboard shortcuts you need to use when working with Adobe Reader differ slightly from those you need to use when using Able2Extract. Therefore, the VBA code must reflect these differences when using SendKeys.
As I explain above, the sample pdf_To_Excel_Adobe macro uses 3 keyboard shortcuts:
- Shortcut #1: “Alt + V, P, C” (SendKeys “%vpc”).
This keyboard shortcut enables scrolling.
- Shortcut #2: “Ctrl + A” (SendKeys “^a”).
This shortcut selects all.
- Shortcut #3: “Ctrl + C” (SendKeys “^c”).
This is the keyboard shortcut for the Copy command.
When working with Able2Extract, you don’t need to enable scrolling (shortcut #1 above). The reason for this is that, within Able2Extract, the “Ctrl + A” keyboard shortcut selects all pages.
Within Able2Extract, “Ctrl + C” is the keyboard shortcut for copying.
As a consequence of the above considerations, when using Able2Extract, you only need to include statements that use SendKeys to transmit the keystrokes corresponding to the “Ctrl + A” (#1 in the image below) and “Ctrl + C” (#2 below) keyboard shortcuts.
The differences between the VBA code of the sample macros (pdf_To_Excel_Adobe and pdf_To_Excel_Able2Extract) are relatively minor. However, as I show in the following section, the results obtained when using Able2Extract are significantly better to those obtained when relying on Adobe Reader.
Let’s take a look at them:
PDF To Excel Using Able2Extract: Example Results
The following screenshot shows the results I obtained when executing the example pdf_To_Excel_Able2Extract macro:
Compare this with those I obtained previously (with the pdf_To_Excel_Adobe macro example) and notice the improvement.
A particularly significant improvement is the fact that the macro that relies on Able2Extract is able to replicate the structure of the table in the original PDF document. The macro that uses Adobe Reader for these purposes generally pastes all the values in a single column.
Furthermore, the macro that relies in Able2Extract is usually able to extract all the values from the source PDF document. This aspect is particularly relevant when comparing the performance of this macro example #2 (pdf_To_Excel_Adobe) and macro example #3 (which I explain below and relies on Microsoft Word instead of Able2Extract).
Granted, there are situations where you may still need to do some data cleansing after converting the data from PDF to Excel using VBA and Able2Extract. However, in my experience, the results obtained when relying on Able2Extract (macro example #2) are usually more accurate and easier to fix/clean than those obtained when relying on Adobe Reader (macro example #1) or Microsoft Word (macro example #3).
A final advantage of Able2Extract is that you can use its variety of keyboard shortcuts when working with SendKeys. This gives you more flexibility for purposes of determining which is the exact portion of data you want to extract. Notice, for example, how you can use keyboard shortcuts to execute virtually all of the commands within the Edit and Document menus of Able2Extract:
Now that we’ve covered the topic of converting PDF to Excel using SendKeys alongside Adobe Reader or Able2Extract, let’s move on to the next way of convert PDF files to Excel: with a recent version of Microsoft Word. For these purposes, let’s start by taking a look at…
The Microsoft Word Object Model: A Quick Introduction
Before you start reading the sections on how to use Word for PDF conversion purposes, please note that you need to have at least Word 2013 installed in your computer. The previous versions of Microsoft Word don’t have the PDF Reflow feature that converts PDF files to editable Word documents.
I cover object models in several posts within Power Spreadsheets. Some of the object models I’ve written about are the following:
- The Excel object model.
- The Outlook object model.
- The Ribbon Commander object model.
As I explain in any of those posts, is important to have a good understanding of the object model of the application(s) you’re working with. The main reason for this is that, whenever you’re working with VBA, you’re basically manipulating the objects of that application.
Word isn’t an exception to this general rule.
Just as Excel, Outlook or the Ribbon Commander, Word has its own object model. Therefore, in this section, I provide a short introduction to this particular model.
What Is The Word Object Model
As a general rule, you can picture a VBA object model as being a hierarchy. This hierarchy contains all of the objects you can use.
Object model hierarchies have an object at the top. That object contains other objects. Those objects may, in turn, contain other objects.
This process goes on until. Therefore, any object within the object model hierarchy has the potential of containing other objects. In any case, you eventually reach a point where a particular object doesn’t hold any other objects.
This basic idea is pretty much the same regardless of which particular VBA object model you’re looking at. In other words, even though the exact components of the Word object model differ from those of the Excel object model, their structure follows the general rule I explain above.
The object at the top of a VBA object model is usually known as Application. The Application object represents the relevant application. More precisely:
- Word’s Application object represents (and returns) Word.
- Excel’s Application object represents (and returns) Excel.
Since Word’s Application object contains other objects, you can think of it as the root from which you access all the other objects of Word’s object model. You generally obtain access to Word’s top-level objects by appropriately using the properties of the Application object.
For purposes of this VBA tutorial, 2 particular properties (and their associated objects) are very relevant:
- Property #1: The Application.Documents property.
- Property #2: The Application.ActiveDocument property.
Let’s take a closer look at each of these:
The Application.Documents Property, The Documents Collection And The Document Object
The main purpose of the Application.Documents property is to return a Documents collection.
The basic syntax of Application.Documents is as follows:
expression.Documents
“expression” is a placeholder for the Application object.
The Documents object returned by the Application.Documents property is a collection representing all the Document objects currently open. Each of those Document objects represents an individual document.
For example, the following image shows 5 open Word documents. In this case, the Documents collection represents all of these documents (Document1 through Document5). Each of the individual documents (Document1, Document2, Document3, Document4 and Document5) is represented by an individual Document object.
Application.ActiveDocument Property
The main purpose of the Application.ActiveDocument property is to return the Document object that represents the current active Word document.
The basic syntax of Application.ActiveDocument follows that of the Application.Documents property above:
expression.ActiveDocument
“expression” stands for an Application object.
Automation, Early And Late Binding: An Introduction
Automation is probably the most commonly used (and the preferred) tool for purposes for purposes of controlling one Windows application from another.
When using Automation for purposes of manipulating one application from another one, you refer to the relevant applications as follows:
- The Server or Object Application is the one that provides a particular set of features you want to use.
Within the context of this blog post, the Server Application is Word.
- The Client or Controlling Application is the one that uses the features of the Server or Object Application.
For purposes of this tutorial, the Client Application is Excel.
Using Automation has several advantages over relying on the Shell Function and SendKeys. In this context, a particularly important advantage of Automation is that it allows you to programmatically access the Component Object Model (COM) objects offered by the Server Application.
In the words of author Richard Mansfield (in Mastering VBA for Microsoft Office 2016):
Automation lets the client application harness the built-in capabilities of the server application.
In the case we’re looking at, you use Excel to harness the built-in capabilities of Word. The purpose of doing this, in this case, is to convert the contents of a PDF file to Excel.
For purposes of using Automation, the main thing you need to is create a connection between the relevant applications. In this particular case, those applications are Word and Excel.
You create such connection by using either of the following types of binding:
- Early binding: In early binding you set up a reference to the relevant (Word’s in this case) object library. Once you’ve created the reference, the objects, properties and methods within that library are available while you’re coding the relevant VBA application.
- Late binding: When using late binding, you don’t need to create a reference to the relevant object library while creating your macro. Instead of this, you create an object that references that object library upon execution.
In the sections below, I provide samples of VBA code that use both early and late binding. In the section containing the code sample that uses early binding, I also show how you can create a reference to Word’s object library.
VBA Code Example #3: PDF To Excel Using Microsoft Word With Early Binding
The following sample macro (pdf_To_Excel_Word_Early_Binding) converts a PDF file to Excel by relying on a recent version of Microsoft Word for conversion purposes.
The macro basically proceeds as follows:
- Step #1: Opens the PDF file as an editable Word document.
- Step #2: Copies the contents of the Word document.
- Step #3: Pastes the Clipboard contents to Excel.
The basic steps of this process (open file, copy and paste) are roughly the same as the steps followed by the previous macro examples #1 (using Adobe Reader) and #2 (with Able2Extract). Therefore, you may notice that some lines of code are substantially the same as those used by the previous samples.
I explain each of the lines of code below. When appropriate, I refer to the equivalent lines of code in previous sections.
However, since this particular macro uses early binding, you must start by setting up a reference to the relevant object libraries. Let’s start by taking a look at this:
How To Set Up A Reference To The Microsoft Word Object Library And The Windows Script Host Object Model
Since the sample pdf_To_Excel_Word_Early_Binding macro relies on Word for purposes of carrying out the PDF to Excel conversion, you must set up a reference to the Microsoft Word Object Library.
However, that’s not the only reference you need:
Due to a quirk I explain below, you also need to set up a reference to the Windows Script Host Object Model if you want to implement this VBA code example #3 and/or the following sample #4.
In order to create these references, go to the Visual Basic Editor and follow these 4 easy steps:
- Step #1: Go to the Tools menu and select “References…”.
- Step #2: Search for the appropriate Microsoft Word Object Library (the label includes the version) within the Available References box. Once you’ve found it, select it (click on the check box on the left side).
The following screenshot is how the dialog box I’m working with looks like.
I’m using Office 2016. Therefore, the label of the Microsoft Word Object Library refers to Microsoft Word 16.0. If you’re using a different version, the version number changes.
- Step #3: Search for the Windows Script Host Object Model and select it.
- Step #4: Click on the OK button on the upper right corner of the References dialog box to confirm your selections.
Once you’ve set the object reference, you can access the Word object model and use the objects, methods and properties it contains. Therefore, let’s take a look at each of the lines of code within the sample pdf_To_Excel_Word_Early_Binding macro.
Line #1: Dim myWorksheet As Worksheet
Uses the Dim statement for purposes of declaring the myWorksheet object variable as a Worksheet.
The purpose of the myWorksheet object variable is to represent the Excel worksheet where the data within the PDF file is pasted.
This line of code is substantially the same as line #1 of the previous macro examples.
Line #2: Dim wordApp As Word.Application
Declares the wordApp object variable as of the type Word.Application.
The main purpose of this line of code is to create a variable representing the Word application in order to expose the Word object model.
Line #3: Dim myWshShell As wshShell
Declares the myWshShell object variable as of the type WshShell.
The purpose of this line of code is similar to that of the previous line #2. More precisely, this line #3 creates the variable that provides access to the native Windows shell. I explain why myWshShell is required when explaining line #13 below.
Lines #4 To #6: Dim pathAndFileName As String | Dim registryKey As String | Dim wordVersion As String
These lines use the Dim statement to declare several variables (pathAndFileName, registryKey and shellPathName) as Strings.
Line #4 (Dim pathAndFileName As String) is exactly the same as line #3 of the previous macro examples. Its purpose is to declare a variable that holds the path to the PDF file you want to convert to Excel.
The purpose of the remaining lines #5 and #6 is as follows:
- Line #5: Declares the registryKey variable.
This variable holds a relatively long string (see line #12 below). This variable appears in lines #13 and #17 below. In those lines, the RegWrite method is used to ensure that Word opens the PDF file you want to convert without displaying a dialog box warning you of the conversion.
- Line #6: Declares wordVersion.
The purpose of wordVersion is to hold the version number of the Word version that you’ll use when executing the macro. This value is assigned in line #11 below.
Line #7: Set myWorksheet = ActiveWorkbook.Worksheets(“Word Early Binding”)
This line of code is substantially the same as line #5 from the previous macro examples.
It uses the Set statement to assign a particular worksheet (Word Early Binding) of the active workbook to the myWorksheet object variable.
Lines #8 And #9: Set wordApp = New Word.Application | Set myWshShell = New wshShell
Both of these lines of code use the Set statement for purposes of assigning a new (New) instance of the relevant object to the applicable object variable. More precisely:
- Line #8 assigns a new instance of the Word application (Word.Application) to the wordApp variable.
- Line #9 assigns a new instance of wshShell to the myWshShell object variable.
This assignment is achieved as follows:
- Step #1: The statements create a new instance of Word.Application or wshShell by using the New keyword.
- Step #2: The assignment operator (=) (i) assigns the expression to its right side (New Word.Application or New myWshShell) to (ii) the variable to its left side (wordApp or myWshShell).
Line #10: pathAndFileName = “C:…Power Spreadsheets ExamplesStatistical Tables.pdf”
This line is exactly the same as line #7 in the previous macro examples.
It uses the assignment operator (=) to (i) assign the path of the PDF file you want to convert to Excel (Statistical Tables.pdf) to (ii) the pathAndFileName variable. pathAndFileName is declared in line #4 above.
Line #11: wordVersion = wordApp.Version
This line of code uses Word’s Application.Version property (wordApp.Version). This property returns the version number of the relevant Word version.
That version number returned by the Version property is assigned to the wordVersion variable.
The version number is relevant for purposes of the next…
Line #12: registryKey = “HKCUSOFTWAREMicrosoftOffice” & wordVersion & “WordOptions”
Uses the assignment operator (=) for purposes of (i) assigning the expression to the right of the operator to (ii) the registryKey variable.
The expression on the right side of the assignment operator is created by concatenating the following 3 items:
- Item #1: The string “HKCUSOFTWAREMicrosoftOffice”.
- Item #2: The version number held by the wordVersion variable (see line #11 above).
- Item #3: The string “WordOptions”.
For example, if you’re using Office 2016, the string assigned to registryKey is as follows:
“HKCUSOFTWAREMicrosoftOffice16.0WordOptions”
This string is, in turn, used by…
Line #13: myWshShell.RegWrite registryKey & “DisableConvertPdfWarning”, 1, “REG_DWORD”
In order to understand why this statement is necessary, let’s take a look at what happens when you manually open a PDF file using Word:
In such cases, a dialog box is (usually) displayed. This box informs you about 2 things:
- #1: That Word is converting the PDF file to an editable Word document.
- #2: That (i) the conversion may take a while, and (ii) the Word document may not look exactly like the original file.
In order for Word to proceed with the PDF to Word conversion, you must press the OK button within the displayed dialog box.
Additionally, you have the option of specifying that the message shouldn’t be displayed again. To do this, you just need to check the relevant box that appears on the left side of the dialog box.
Clicking the OK button manually is quite easy. At first glance, this is something that would also be easy to handle using VBA.
However, here’s the deal:
The key that corresponds to the value written to the registry doesn’t initially exist. In order to proceed with the PDF to Word conversion, you must solve this.
And this line #13 is the one that helps you solve this issue. Let’s see how:
The statement uses the RegWrite method of the WshShell object. You can use the RegWrite method for 3 main purposes:
- Purpose #1: Create a new key.
- Purpose #2: Add another value-name to an existing key, and assign it a value.
- Purpose #3: Change the value of an existing value-name.
The basic syntax of RegWrite is as follows:
object.RegWrite(strName, anyValue [,strType])
For these purposes:
- object is a WshShell object.
In this line #13, this object is myWshShell.
- strName is the string value that you use to indicate the key-name, value-name or value you want to work with.
Within the sample code, this string is created by concatenating (i) the string held within the registryKey variable, and (ii) “DisableConvertPDFWarning”. The string assignment for registryKey was made in the previous line #12. As explained by MVP Graham Mayor, the string assigned to strName is the value written to the registry.
In order to specify a key-name, add a backlash () at the end of strName. To specify a value-name (as in the case we’re analyzing), specify strName without a backlash at the end.
- anyValue is either (i) the name of the key you’re creating, (ii) the name of the value you’re adding to an existing key, or (iii) the new value you’re assigning to an existing value-name.
Within line #13, this corresponds to 1.
- strType is an optional parameter that you can use to indicate the data type of the value.
The data type of the value within the statement we’re looking at is REG_DWORD. This strType corresponds to Integer. It allows you to specify a number in the form of an integer.
The practical effect of this line of code #13 is that the warning in the dialog box that I explain above (regarding the PDF to Word conversion) is turned off. Therefore, Visual Basic for Applications is able to proceed with…
Lines #14 To #16: wordApp.Documents.Open Filename:=pathAndFileName, ConfirmConversions:=False
Lines #14 to #16 are a single statement. The statement calls the Documents.Open method. This method allows you to:
- #1: Open a document; and
- #2: Add the newly opened document to the Documents collection.
The basic syntax of Documents.Open is as follows:
expression.Open(FileName, ConfirmConversions, ReadOnly, AddToRecentFiles, PasswordDocument, PasswordTemplate, Revert,WritePasswordDocument, WritePasswordTemplate, Format, Encoding, Visible, OpenConflictDocument, OpenAndRepair,DocumentDirection, NoEncodingDialog)
Notice that the statement we’re analyzing only makes reference to 2 of the parameters of Documents.Open:
- Filename.
Filename is the only required parameter of the Documents.Open method. You use it to specify the name of the document you want to open and add to the Documents collection.
In the sample pdf_To_Excel_Word_Early_Binding macro, the value of this parameter is that contained within the pathAndFileName. The value assigned to pathAndFileName in line #4 above corresponds to the path and file name of the PDF document you’re converting to Excel.
- ConfirmConversions.
ConfirmConversions is an optional parameter. It allows you to determine whether the Convert File dialog box is displayed or not.
This parameter can be set to True (display) or False (don’t display). In this code example, it’s set to False.
As a consequence of the above, the practical effect of lines #14 to #16 is that Word opens the PDF file you want to convert to Excel.
Line #17: myWshShell.RegWrite registryKey & “DisableConvertPdfWarning”, 0, “REG_DWORD”
This line of code is substantially the same as line #15 above.
The only difference between lines #15 and #17 is the value assigned to the anyValue parameter of the RegWrite method.
- In line #15 above, it’s 1.
- In this line #17, it’s 0.
In practice, line #17 reverses what line #15 does. In other words, it turns the warning regarding the PDF to Word conversion back on.
Other than the above, the general aspects within the explanation I provide in line #15 above are applicable.
Line #18: wordApp.ActiveDocument.Content.Copy
Uses Word’s Range.Copy method for purposes of copying the content of the file you want to convert.
The basic syntax of the Range.Copy method is as follows:
expression.Copy
For these purposes, “expression” represents a Word Range object. Within this line #18, this Range object is that returned by the Document.Content property.
When you use Document.Content, it returns a Range object “representing the main document story”. The Document object to which the Content property makes reference to is the active word document (returned by Word’s Application.ActiveDocument property).
Lines #19 And #22: With myWorksheet | End With
These 2 lines of code are the opening and closing statements of a With… End With block.
The practical effect of With… End With is that the series of statements within the block are executed on the same object. In the case of the sample macro we’re looking at:
- The statements within the With… End With block are the following lines #20 and #21.
- The object they’re executed on is that represented by myWorksheet (specified in line #19).
The whole With… End With block composed by lines #19 to #22 of this sample macro has substantially the same effect as lines #17 to #20 of the previous macro examples.
Line #20: .Range(“B4”).Select
This line of code is the same as line #18 of prior code examples. It uses the Range.Select method to select cell B4 of myWorksheet.
This line sets the location where the data from the PDF file you’re converting is pasted. This statement is needed because it’s required by the Worksheet.PasteSpecial method used by…
Line #21: .PasteSpecial Format:=”Text”
This line of code is exactly the same as line #19 of the previous sample macros.
More precisely, it uses the Worksheet.PasteSpecial method for purposes of pasting the contents of within the Clipboard on the selected range (see line #20 above) using a particular format. In the example we’re looking at, Format is set to “Text”.
Line #23: wordApp.Quit SaveChanges:=wdDoNotSaveChanges
Uses Word’s Application.Quit method to close Word without saving the file changes.
The syntax of the Quit method is as follows:
expression.Quit(SaveChanges, Format, RouteDocument)
For purposes of this tutorial, you only need to be aware of the following 2 items:
- “expression”, which is a variable that represents the Word Application object. In this case, expression is wordApp.
- The SaveChanges parameter, which allows you to specify whether Word saves the changes to the document before quitting. You can set the SaveChanges parameter to any of the following 3 values from the WdSaveOptions Enumeration:
Name Value Description wdDoNotSaveChanges 0 Don’t save changes. wdPromptToSaveChanges -2 Ask the user whether the changes should be saved. wdSaveChanges -1 Save changes. In this case, I set SaveChanges to wdDoNotSaveChanges. Therefore, Word doesn’t save the changes (the conversion) of the PDF file.
Lines #24 And #25: Set wordApp = Nothing | Set myWshShell = Nothing
Both lines of code use the Set statement to set the wordApp and myWshShell object variables to Nothing. The reason these statements are included is to release the memory.
VBA Code Example #4: PDF To Excel Using Microsoft Word With Late Binding
The following sample macro (pdf_To_Excel_Word_Late_Binding) is the equivalent to that in the previous example #3. The only difference is that, instead of using early binding, this sample #4 uses late binding.
The macro, therefore, follows exactly the same 3-step process to convert the PDF file to Excel:
- Step #1: Open the PDF file as an editable Word document.
- Step #2: Copy the data within the Word document.
- Step #3: Paste the contents of the Clipboard to Excel.
There are only a few differences between the 2 macro examples that rely on Word to convert a PDF file. I highlight them in the following image:
I explain these differences in the following sections. Other than these clarifications, the comments I provide above when explaining macro example #3 (with early binding) are roughly applicable to this macro.
When using late binding, you don’t need to set up a reference to the Word Object Library or the Windows Script Host Object model.
Let’s take a look at the code changes:
Lines #2 And #3: Dim wordApp As Object | Dim myWshShell As Object
Both of these lines of code use the Dim statement to declare an object variable. Furthermore, in both cases, the object variable is declared as an Object. This declaration (as Object) forces late binding.
These lines of code are the equivalent of lines #2 and #3 of the previous macro example #3 that uses early binding. In that case, however:
- WordApp is declared As Word.Application.
- myWshShell is declared As wshShell.
Line #7: Set myWorksheet = ActiveWorkbook.Worksheets(“Word Late Binding”)
The line #7 of this example #4 makes reference to the worksheet named “Word Late Binding”. Line #7 of the previous example makes reference to “Word Early Binding”.
The consequence of this difference is that both macros paste the data from the PDF file in different Excel worksheets.
This difference, however, isn’t strictly related to the topic of early binding vs. late binding. It’s simply a consequence of the way in which I create and organize the sample workbooks that accompany this tutorial.
Lines #8 And #9: Set wordApp = CreateObject(“Word.Application”) | Set myWshShell = CreateObject(“WScript.Shell”)
Both lines of code achieve the following:
- Step #1: Create and return a reference to the relevant ActiveX object you need to work with. For purposes of this example, these objects are the Word Application (Word.Application) and the Windows Shell (WshShell).
In both cases, this is achieved by using the CreateObject Function.
- Step #2: Assign the object references created in step #1 to the wordApp and myWshShell object variables.
In the previous macro example #3 (with early binding), the object references are created in the same lines #8 and #9. However, in those cases, the object reference is created by using the New keyword instead of the CreateObject Function. Therefore:
- wordApp is set to New Word.Application.
- myWshShell is set to New wshShell.
Line #23: wordApp.Quit SaveChanges:=0
This line of code achieves exactly the same purpose as line #23 of the previous macro example #3. The consequence of this line is, therefore, that Word closes without saving the file changes.
The only difference between the syntax used in the macro that uses early binding and here is the way in which the value of the SaveChanges parameter is specified. In the previous example, this was specified as wdDoNotSaveChanges. Here, it’s assigned the value of 0.
Since the value 0 corresponds to wdDoNotSaveChanges, they’re both equivalent.
Now that we’ve covered how you can use Word to extract data from PDF to Excel using VBA, let’s take a look at the results I obtain when executing the sample macros #3 and #4:
PDF To Excel Using Microsoft Word: Example Results
The VBA code examples #3 and #4 are materially the same. The only difference between them is the fact that #3 uses early binding, whereas #4 uses late binding.
As a consequence of the above, the results I obtain when executing both macros are substantially the same. Therefore, in the images below, I only show the results I obtain when executing the early binding macro version.
The following screenshot shows the first section of the results I obtain when executing the sample pdf_To_Excel_Word_Early_Binding macro.
At first glance these results seem to be very good. In particular, (i) the values seem to be appropriately extracted and (ii) the structure of the original file (including its tables) seems to be well replicated.
However, upon scrolling down the file, some issues become evident. Check out, for example, the following screenshot and notice how the 2 rows I highlight.
The following screenshot shows how this data looks in the original PDF file.
As evidenced by the above screenshots and previous conversion results within this blog post (here and here), the macro example that relies on Word for purposes of converting a PDF file to Excel is generally:
- Less accurate than the macro example that relies on Able2Extract.
- More accurate than the sample macro that uses Adobe Reader.
Microsoft explains the reasons why Word may not accurately convert a particular PDF file here. In very general terms, the conversion feature that Word uses (known as PDF Reflow) “works best with files that are mostly text”.
The following are some of the items where a conversion that relies in Microsoft Word is more prone to inaccuracies:
- Tables with cell spacing.
- Footnotes that span more than 1 page.
- Endnotes.
- PDF comments.
- Pages or sections that are mostly charts or other graphs. In such cases, the page or section may be transferred to Excel as an image.
Despite the above weakness, relying on Microsoft Word for purposes of creating a macro that converts PDF to Excel has some advantages (vs. macros that use SendKeys):
For example, by using Microsoft Word to convert a PDF file to Excel, you gain access to all the features and tools contained within the Microsoft Word Object Model. Furthermore, by avoiding the Shell Function and SendKeys, you prevent the timing issues that are sometimes associated to their use.
By now, you probably have a very good idea of the pros and cons of each of the different approaches I cover in this VBA tutorial. Therefore, is a good time to check out…
PDF To Excel Using VBA: Which Approach To Use
After learning about the different applications you can rely on for purposes of converting a PDF file to Excel (Adobe Reader, Able2Extract and Word), you may (rightly) be wondering whether one is better than the others.
In my opinion, each of the 3 approaches I explain in this tutorial has different advantages and disadvantages.
Despite the above, in my experience, the results you obtain when relying on Adobe Reader are likely to be of lower quality and require more data cleansing than those obtained with Able2Extract or Word.
In order to choose between Word and Able2Extract, I generally suggest that you consider the particular characteristics of the situation you’re in. Some of the criteria you may consider are the following:
- Criterion #1: Whether you have access to both applications (Able2Extract or a recent version of Microsoft Word).
This question may seem relatively obvious, but the computer where your VBA application will be running needs to have installed the application you’ll be using (Able2Extract or a recent version of Microsoft Word) for the macro examples I propose above to work.
- Criterion #2: Length and complexity of the PDF data you want to bring into Excel.
If the PDF files you’re converting are particularly lengthy and complex, Able2Extract may provide better conversion results. As I explain above, Word is prone to inaccuracies when the PDF file has items such as (i) tables with cell spacing, (ii) footnotes that span more than 1 page, (iii) endnote, (iv) PDF comments, or (v) pages or sections that are mostly charts or other graphs.
- Criterion #3: How many PDF to Excel conversions do you need to carry out.
If you’re constantly converting PDF files to Excel, you may find that the investment in a PDF converter such as Able2Extract is worth it.
- Criterion #4: Whether you’d like to use the items within the Microsoft Word Object Model.
As I explain above, one of the advantages of relying on Microsoft Word, is that you can access its object model. In certain scenarios, this may give you a good amount of flexibility and control over how you handle the PDF to Excel conversion.
Conclusion
After reading this VBA tutorial, you know at least 3 ways in which you can convert PDF to Excel using VBA. More precisely, you know how to leverage the following applications for purposes of extracting data from a PDF file to Excel:
- Adobe Reader.
- Able2Extract.
- Microsoft Word.
This VBA Convert PDF to Excel Tutorial is accompanied by Excel workbooks containing the data and macros I use above. You can get immediate free access to this example workbook by subscribing to the Power Spreadsheets Newsletter.
In addition to the above, you’ve also read about the following topics:
- The Shell Function and TaskKill Command.
- SendKeys.
- The Microsoft Word Object Model.
- Automation, Early and Late Binding.
For purposes of this VBA tutorial, I’ve excluded approaches that require relying on an API (application programming interface). If you want (and have the possibility), you can use an API for purposes of converting PDF files to Excel. Using an API certainly has advantages over some of the methods that I describe above. For example:
- An API allows you to better handle potential timing issues, such as those that arise when relying on the Shell Function and SendKeys.
- Depending on which API you’re using, you may able to get more accurate conversions (particularly) than those made through Microsoft Word.
As I show above, Word’s PDF Reflow feature works well in several scenarios. However, there are some situations where a more specialized PDF converter (such as Able2Extract) provides better and more accurate results.
I may write about the topic of PDF to Excel conversion using APIs in future posts. If you want to receive an email whenever I publish new material in Power Spreadsheets, please make sure to enter your email address below now:
Books And Resources Referenced In This Excel VBA Tutorial
- Alexander, Michael and Kusleika, Dick (2016). Excel 2016 Power Programming with VBA. Indianapolis, IN: John Wiley & Sons Inc.
- Dougherty, Christopher (2011). Introduction to Econometrics. New York City, NY: Oxford University Press.
- Mansfield, Richard (2016). Mastering VBA for Microsoft Office 2016. Indianapolis, IN: John Wiley & Sons Inc.
The following link is an affiliate link, which means that if you choose to make a purchase, I will earn a commission. This commission comes at no additional cost to you.
- Able2Extract.
Извлечение таблиц из pdf (в excel), прив. w/vba
Я пытаюсь извлечь таблицы из pdf файлов с помощью vba и экспортировать их в excel. Если все работает так, как должно, оно должно идти автоматически. Проблема в том, что таблица не стандартизирована.
Это то, что у меня есть до сих пор.
- VBA (Excel) запускает XPDF и преобразует все .pdf файлы, найденные в текущей папке, в текстовый файл.
- VBA (Excel) читает каждый текстовый файл по строкам.
И код:
With New Scripting.FileSystemObject
With .OpenTextFile(strFileName, 1, False, 0)
If Not .AtEndOfStream Then .SkipLine
Do Until .AtEndOfStream
//do something
Loop
End With
End With
Все это прекрасно работает. Но теперь я сталкиваюсь с проблемой извлечения таблиц из текстовых файлов.То, что я пытаюсь сделать, это VBA, чтобы найти строку, например. «Year Income», а затем вывести данные после него в столбцы. (Пока таблица не закончится.)
Первая часть не очень сложна (найдите определенную строку), но как бы я обошел вторую часть. Текстовый файл будет выглядеть как этот Pastebin . Проблема в том, что текст не стандартизирован. Таким образом, например, некоторые таблицы имеют 3-летние столбцы (2010 2011 2012), а некоторые только две (или 1), некоторые таблицы имеют больше пробелов между столбцом, а некоторые не включают определенные строки (такие как Capital Asset, net).
Я думал о том, чтобы делать что-то подобное, но не уверен, как это сделать в VBA.
- Найти строку, определенную пользователем. например. «Таблица 1: Возврат лет».
- а. Следующая строка найти годы; если есть два, нам понадобятся три столбца на выходе (заголовки +, 2x год), если их три, нам понадобятся четыре (титры +, 3 раза в год).. и т.д.
б. Создайте столбец столбца + столбец за каждый год. - При достижении конца строки перейдите к следующей строке
- а. Читать текст → вывод в столбец 1.
б. Распознавать пробелы (пробелы > 3?) Как начало столбца 2. Чтение чисел → вывод в столбец 2.
с. (если столбец = 3) Признать пробелы как начало столбца 3. Чтение чисел → вывод в столбец 3.
д. (если столбец = 4) Распознавать пробелы как начало столбца 4. Считать числа → вывод в столбец 4. - Каждая строка, цикл 4.
- Следующая строка не включает числа — Конечная таблица. (возможно, easiet только пользовательский номер, после 15 символов нет номера конечной таблицы)
Я основывал свою первую версию на Pdf, чтобы преуспеть , но чтение онлайн-людей не рекомендуется OpenFile, а скорее FileSystemObject (хотя это, кажется, намного медленнее).
Любые указатели, чтобы начать меня, главным образом на шаге 2?
6
vba excel pdf filesystemobject
задан MeRuud 23 февр. ’13 в 23:43 источник поделиться
2 ответов
У вас есть несколько способов проанализировать текстовый файл и в зависимости от того, насколько сложно это может привести к тому, что вы опираетесь так или иначе. Я начал это, и он немного вышел из рук… наслаждайтесь.
На основе предоставленного вами примера и дополнительных комментариев я отметил следующее. Некоторые из них могут хорошо работать для простых файлов, но могут стать неудобными с более сложными файлами. Кроме того, могут быть несколько более эффективные методы или трюки к тому, что я использовал здесь, но это, безусловно, поможет вам достичь желаемого результата. Надеюсь, это имеет смысл в сочетании с предоставленным кодом:
- Вы можете использовать логические значения, чтобы помочь вам определить, какой «раздел» текстового файла вы находитесь. Т.е. используйте InStr в текущей строке, чтобыопределите, что вы находитесь в таблице, ища текст «Таблица», а затемкак только вы знаете, что находитесь в разделе «Таблица» начала файлаищет раздел «Активы» и т.д.
- Вы можете использовать несколько методов для определения количества лет (или столбцов), которое у вас есть. Функция Split вместе с циклом будетработа.
- Если ваши файлы всегда имеют постоянное форматирование, даже в определенных частях, вы можете воспользоваться этим. Например, если вы знаетестрока файла всегда будет иметь знак доллара перед ними, затемвы знаете, что это определит ширину столбцов, и вы можете использовать это напоследующие строки текста.
Следующий код будет извлекать детали активов из текстового файла, вы можете модифицировать его для извлечения других разделов. Он должен обрабатывать несколько строк. Надеюсь, я прокомментировал это достаточно. Посмотрите, и я отредактирую, если вам нужно помочь.
Sub ReadInTextFile()
Dim fs As Scripting.FileSystemObject, fsFile As Scripting.TextStream
Dim sFileName As String, sLine As String, vYears As Variant
Dim iNoColumns As Integer, ii As Integer, iCount As Integer
Dim bIsTable As Boolean, bIsAssets As Boolean, bIsLiabilities As Boolean, bIsNetAssets As Boolean
Set fs = CreateObject(«Scripting.FileSystemObject»)
sFileName = «G:Sample.txt»
Set fsFile = fs.OpenTextFile(sFileName, 1, False)
‘Loop through the file as you’ve already done
Do While fsFile.AtEndOfStream <> True
‘Determine flag positions in text file
sLine = fsFile.Readline
Debug.Print VBA.Len(sLine)
‘Always skip empty lines (including single spaceS)
If VBA.Len(sLine) > 1 Then
‘We’ve found a new table so we can reset the booleans
If VBA.InStr(1, sLine, «Table») > 0 Then
bIsTable = True
bIsAssets = False
bIsNetAssets = False
bIsLiabilities = False
iNoColumns = 0
End If
‘Perhaps you want to also have some sort of way to designate that a table has finished. Like so
If VBA.Instr(1, sLine, «Some text that designates the end of the table») Then
bIsTable = False
End If
‘If we’re in the table section then we want to read in the data
If bIsTable Then
‘Check for your different sections. You could make this constant if your text file allowed it.
If VBA.InStr(1, sLine, «Assets») > 0 And VBA.InStr(1, sLine, «Net») = 0 Then bIsAssets = True: bIsLiabilities = False: bIsNetAssets = False
If VBA.InStr(1, sLine, «Liabilities») > 0 Then bIsAssets = False: bIsLiabilities = True: bIsNetAssets = False
If VBA.InStr(1, sLine, «Net Assests») > 0 Then bIsAssets = True: bIsLiabilities = False: bIsNetAssets = True
‘If we haven’t triggered any of these booleans then we’re at the column headings
If Not bIsAssets And Not bIsLiabilities And Not bIsNetAssets And VBA.InStr(1, sLine, «Table») = 0 Then
‘Trim the current line to remove leading and trailing spaces then use the split function to determine the number of years
vYears = VBA.Split(VBA.Trim$(sLine), » «)
For ii = LBound(vYears) To UBound(vYears)
If VBA.Len(vYears(ii)) > 0 Then iNoColumns = iNoColumns + 1
Next ii
‘Now we can redefine some variables to hold the information (you’ll want to redim after you’ve collected the info)
ReDim sAssets(1 To iNoColumns + 1, 1 To 100) As String
ReDim iColumns(1 To iNoColumns) As Integer
Else
If bIsAssets Then
‘Skip the heading line
If Not VBA.Trim$(sLine) = «Assets» Then
‘Increment the counter
iCount = iCount + 1
‘If iCount reaches it limit you’ll have to redim preseve you sAssets array (I’ll leave this to you)
If iCount > 99 Then
‘You’ll find other posts on stackoverflow to do this
End If
‘This will happen on the first row, it’ll happen everytime you
‘hit a $ sign but you could code to only do so the first time
If VBA.InStr(1, sLine, «$») > 0 Then
iColumns(1) = VBA.InStr(1, sLine, «$»)
For ii = 2 To iNoColumns
‘We need to start at the next character across
iColumns(ii) = VBA.InStr(iColumns(ii — 1) + 1, sLine, «$»)
Next ii
End If
‘The first part (the name) is simply up to the $ sign (trimmed of spaces)
sAssets(1, iCount) = VBA.Trim$(VBA.Mid$(sLine, 1, iColumns(1) — 1))
For ii = 2 To iNoColumns
‘Then we can loop around for the rest
sAssets(ii, iCount) = VBA.Trim$(VBA.Mid$(sLine, iColumns(ii) + 1, iColumns(ii) — iColumns(ii — 1)))
Next ii
‘Now do the last column
If VBA.Len(sLine) > iColumns(iNoColumns) Then
sAssets(iNoColumns + 1, iCount) = VBA.Trim$(VBA.Right$(sLine, VBA.Len(sLine) — iColumns(iNoColumns)))
End If
Else
‘Reset the counter
iCount = 0
End If
End If
End If
End If
End If
Loop
‘Clean up
fsFile.Close
Set fsFile = Nothing
Set fs = Nothing
End Sub
2
ответ дан CuberChase 24 февр. ’13 в 11:03 источник поделиться
Я не могу проверить образцы данных, поскольку PasteBin был удален. Основываясь на том, что я могу почерпнуть из описания проблемы, мне кажется, что использование регулярных выражений упростит анализ данных.
Добавьте ссылку на скрипт Runtime scrrun.dll для FileSystemObject.
Добавьте ссылку на регулярные выражения Microsoft VBScript 5.5. библиотека для объекта RegExp.
Создайте экземпляр объекта RegEx с помощью Dim objRE As New RegExp
Задайте для свойства Pattern значение «( bd {4}b) {1,3}» Вышеупомянутый шаблон должен соответствовать строкам, содержащим строки, такие как:20102010 20112010 2011 2012
Число пробелов между строками года не имеет значения, если существует хотя бы один (поскольку мы не ожидаем, что будем сталкиваться с такими строками, как 201020112012)
Установите для свойства Global значение True
Захваченные группы будут найдены в отдельных объектах Match из MatchCollection, возвращенных методом Execute объекта OBRERE объекта RegEx. Поэтому объявите соответствующие объекты:
Dim objMatches as MatchCollection
Dim objMatch as Match
Dim intMatchCount ‘tells you how many year strings were found, if any
Предполагая, что вы создали объект FileSystemObject и просматриваете текстовый файл, считывая каждую строку в переменной strLine
Первый тест, чтобы увидеть, содержит ли текущая строка искомый шаблон:
If objRE.Test(strLine) Then
‘do something
Else
‘skip over this line
End If
Set objMatches = objRe.Execute(strLine)
intMatchCount = objMatches.Count
For i = 0 To intMatchCount — 1
‘processing code such as writing the years as column headings in Excel
Set objMatch = objMatches(i)
e.g. ActiveCell.Value = objMatch.Value
‘subsequent lines beneath the line containing the year strings should
‘have the amounts, which may be captured in a similar fashion using an
‘additional RegExp object and a Pattern such as «(bd+b){1,3}» for
‘whole numbers or «(bd+.d+b){1,3}» for floats. For currency, you
‘can use «(b$d+.d{2}b){1,3}»
Next i
Это всего лишь приблизительная схема того, как я подхожу к этой проблеме. Я надеюсь, что в этом коде есть что-то, что поможет вам.